How do I read multiple tokens between characters in java?

I am working on a project where I need to read multiple lines of user data from a txt file. Each line is used to create a profile.
For example:
name,lastname,email,hobbies1;hobbies2...hobbiesN,activity1;activity2...activityN
name2,lastname2,email.... and so on
I don't know how many hobbies or activities there are, so I have to store them in arrays. All the values for one profile are on a single line.
I tried using a delimiter and split, but when I move on to the next line I get an InputMismatchException.

The simplest way is to change the format.
Instead of separating every field with , use ; to separate the different attributes and , to separate the elements of an array. The end result will be something like:
name; lastname; email; hobbies1, hobbies2, ..., hobbiesN; activity1, activity2, ..., activityN
First split the String using ; as the delimiter. Then, for the fields that hold arrays, split that substring using , as the delimiter to get the individual elements.
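A minimal sketch of that two-level split, assuming the reworked format shown above. The class name ProfileLineParser, the method parse, and the sample values are made up for illustration; they are not from the question.

```java
import java.util.Arrays;

public class ProfileLineParser {

    // Splits one line of the reworked format into fields on ';',
    // then splits each field into its elements on ','.
    // Whitespace around fields and elements is dropped.
    public static String[][] parse(String line) {
        String[] fields = line.split(";");
        String[][] result = new String[fields.length][];
        for (int i = 0; i < fields.length; i++) {
            result[i] = fields[i].trim().split("\\s*,\\s*");
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical sample line in the "; between fields, , inside arrays" format
        String line = "Ann; Lee; ann@example.com; chess, hiking; running, yoga, tennis";
        String[][] parsed = parse(line);
        System.out.println("name=" + parsed[0][0]);
        System.out.println("hobbies=" + Arrays.toString(parsed[3]));
        System.out.println("activities=" + Arrays.toString(parsed[4]));
    }
}
```

Single-valued fields such as the name come back as one-element arrays, so every field can be handled the same way.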

Read lines, then split on comma, and split on semi-colon where needed.
Don't use Scanner for line-reading a file.
try (BufferedReader in = Files.newBufferedReader(Paths.get("test.txt"))) {
    for (String line; (line = in.readLine()) != null; ) {
        String[] fields = line.split(",");
        String name = (fields.length >= 1 ? fields[0] : "");
        String lastname = (fields.length >= 2 ? fields[1] : "");
        String email = (fields.length >= 3 ? fields[2] : "");
        String[] hobbies = (fields.length >= 4 ? fields[3].split(";") : new String[0]);
        String[] activities = (fields.length >= 5 ? fields[4].split(";") : new String[0]);
        System.out.println("name=" + name +
                           ", lastname=" + lastname +
                           ", email=" + email +
                           ", hobbies=" + Arrays.toString(hobbies) +
                           ", activities=" + Arrays.toString(activities));
    }
}
test.txt
name,lastname,email,hobbies1;hobbies2...hobbiesN,activity1;activity2...activityN
name2,lastname2,email
Output
name=name, lastname=lastname, email=email, hobbies=[hobbies1, hobbies2...hobbiesN], activities=[activity1, activity2...activityN]
name=name2, lastname=lastname2, email=email, hobbies=[], activities=[]

Related

Can not count how many number of unique date are available in every part of string

I split my string into three parts using the newline character ('\n'). What I want to achieve: count how many unique dates there are in each part of the string.
In the code below, the first part contains two unique dates, the second part contains two, and the third part contains three. So the output should be: 2,2,3,
But when I run the code below I get this output: 5,5,5,5,1,3,1,
How do I get the output 2,2,3, instead?
Thanks in advance.
String strH;
String strT = null;
StringBuilder sbE = new StringBuilder();
String strA = "2021-03-02,2021-03-02,2021-03-02,2021-03-02,2021-03-02,2021-03-11,2021-03-11,2021-03-11,2021-03-11,2021-03-11," + '\n' +
"2021-03-07,2021-03-07,2021-03-07,2021-03-07,2021-03-07,2021-03-15,2021-03-15,2021-03-15,2021-03-15,2021-03-15," + '\n' +
"2021-03-02,2021-03-09,2021-03-07,2021-03-09,2021-03-09,";
String[] strG = strA.split("\n");
for(int h=0; h<strG.length; h++){
strH = strG[h];
String[] words=strH.split(",");
int wrc=1;
for(int i=0;i<words.length;i++) {
for(int j=i+1;j<words.length;j++) {
if(words[i].equals(words[j])) {
wrc=wrc+1;
words[j]="0";
}
}
if(words[i]!="0"){
sbE.append(wrc).append(",");
strT = String.valueOf(sbE);
}
wrc=1;
}
}
Log.d("TAG", "Output: "+strT);

I would use a set here to count the distinct dates on each line:
String strA = "2021-03-02,2021-03-02,2021-03-02,2021-03-02,2021-03-02,2021-03-11,2021-03-11,2021-03-11,2021-03-11,2021-03-11" + "\n" +
"2021-03-07,2021-03-07,2021-03-07,2021-03-07,2021-03-07,2021-03-15,2021-03-15,2021-03-15,2021-03-15,2021-03-15" + "\n" +
"2021-03-02,2021-03-09,2021-03-07,2021-03-09,2021-03-09";
String[] lines = strA.split("\n");
List<Integer> counts = new ArrayList<>();
for (String line : lines) {
counts.add(new HashSet<String>(Arrays.asList(line.split(","))).size());
}
System.out.println(counts); // [2, 2, 3]
Note that I have done a minor cleanup of the strA input by removing the trailing comma from each line.

With Java 8 Streams, this can be done in a single statement:
String strA = "2021-03-02,2021-03-02,2021-03-02,2021-03-02,2021-03-02,2021-03-11,2021-03-11,2021-03-11,2021-03-11,2021-03-11," + '\n' +
"2021-03-07,2021-03-07,2021-03-07,2021-03-07,2021-03-07,2021-03-15,2021-03-15,2021-03-15,2021-03-15,2021-03-15," + '\n' +
"2021-03-02,2021-03-09,2021-03-07,2021-03-09,2021-03-09,";
String strT = Pattern.compile("\n").splitAsStream(strA)
.map(strG -> String.valueOf(Pattern.compile(",").splitAsStream(strG).distinct().count()))
.collect(Collectors.joining(","));
System.out.println(strT); // 2,2,3
Note that Pattern.compile("\n").splitAsStream(strA) can also be written as Arrays.stream(strA.split("\n")), which is shorter to write but creates an unnecessary intermediate array. Which is better is a matter of personal preference.
String strT = Arrays.stream(strA.split("\n"))
.map(strG -> String.valueOf(Arrays.stream(strG.split(",")).distinct().count()))
.collect(Collectors.joining(","));
The first version can be further micro-optimized by only compiling the regex once:
Pattern patternComma = Pattern.compile(",");
String strT = Pattern.compile("\n").splitAsStream(strA)
.map(strG -> String.valueOf(patternComma.splitAsStream(strG).distinct().count()))
.collect(Collectors.joining(","));

Java - String splitting

I read a txt file with data in the following format: Name Address Hobbies
Example: Bob Smith ABC Street Swimming
and assigned the line to a String z.
Then I used z.split with " " (a space) as the delimiter to separate the fields, but it split Bob Smith into two different strings when it should be one field. The same happened with the address. Is there a method I can use to get the fields in the format I want?
P.S. Apologies if my explanation is vague; English isn't my first language.
String z;
try {
BufferedReader br = new BufferedReader(new FileReader("desc.txt"));
z = br.readLine();
} catch(IOException io) {
io.printStackTrace();
}
String[] temp = z.split(" ");

If the format of name and address parts is fixed to consist of two parts, you could just join them:
String z = ""; // z must be initialized
// use try-with-resources to ensure the reader is closed properly
try (BufferedReader br = new BufferedReader(new FileReader("desc.txt"))) {
z = br.readLine();
} catch(IOException io) {
io.printStackTrace();
}
String[] temp = z.split(" ");
String name = String.join(" ", temp[0], temp[1]);
String address = String.join(" ", temp[2], temp[3]);
String hobby = temp[4];
Another option could be to create a format string as a regular expression and use it to parse the input line using named groups (?<group_name>capturing text):
// use named groups to define parts of the line
Pattern format = Pattern.compile("(?<name>\\w+\\s\\w+)\\s(?<address>\\w+\\s\\w+)\\s(?<hobby>\\w+)");
Matcher match = format.matcher(z);
if (match.matches()) {
String name = match.group("name");
String address = match.group("address");
String hobby = match.group("hobby");
System.out.printf("Input line matched: name=%s address=%s hobby=%s%n", name, address, hobby);
} else {
System.out.println("Input line not matching: " + z);
}

I can think of three solutions.
In order from best to worst:
1. Use a different delimiter.
2. Enforce the format to always have two names, two address parts and one hobby.
3. Have a dictionary with names and hobbies, check each word to determine which type it is, and then group them together as needed.
(The 3rd option is not meant as a serious alternative.)

As others have mentioned, using spaces as both field delimiter and inside fields is problematic. You could use a regex pattern to split the line (paste (\w+ \w+) (\w+ \w+) (.+) in Regex101 for an explanation):
Pattern pattern = Pattern.compile("(\\w+ \\w+) (\\w+ \\w+) (.+)");
Matcher matcher = pattern.matcher("Bob Smith ABC Street Bowling Fishing Rollerblading");
System.out.println("matcher.matches() = " + matcher.matches());
for (int i = 0; i <= matcher.groupCount(); i++) {
System.out.println("matcher.group(" + i + ") = " + matcher.group(i));
}
This would give the following output:
matcher.matches() = true
matcher.group(0) = Bob Smith ABC Street Bowling Fishing Rollerblading
matcher.group(1) = Bob Smith
matcher.group(2) = ABC Street
matcher.group(3) = Bowling Fishing Rollerblading
However this only works for this exact format. If you get a line with three name parts for example:
John B Smith ABC Street Swimming
This will get split into John B as the name, Smith ABC as the address and Street Swimming as hobbies.
So either make 100% sure your input will always match this format or use a different delimiter.

The split() method mainly works with two things:
the delimiter and
the String object,
and sometimes a limit as well.
Whatever limit you provide, split() will honor it.
It doesn't know whether the substring on the left is a name or not, and the same goes for the substring on the right.
Have a look at this code snippet:
String assets = "Gold:Stocks:Fixed Income:Commodity:Interest Rates";
String[] splits = assets.split(":");
System.out.println("splits.size: " + splits.length);
for (String asset : splits) {
    System.out.println(asset);
}
Output
splits.size: 5
Gold
Stocks
Fixed Income // with space
Commodity
Interest Rates // with space
The elements keep their spaces because I used : as the delimiter, not the space.
Hopefully this answers your question.
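The limit argument mentioned above can be seen in a small sketch. String.split(String regex, int limit) is a standard JDK method; the class name SplitLimitDemo and the helper splitWithLimit are made up for illustration.

```java
import java.util.Arrays;

public class SplitLimitDemo {

    // Splits on ':' but produces at most 'limit' pieces (for a positive limit);
    // the last piece keeps any remaining delimiters unsplit.
    public static String[] splitWithLimit(String text, int limit) {
        return text.split(":", limit);
    }

    public static void main(String[] args) {
        String assets = "Gold:Stocks:Fixed Income:Commodity:Interest Rates";
        System.out.println(Arrays.toString(splitWithLimit(assets, 3)));
        // prints [Gold, Stocks, Fixed Income:Commodity:Interest Rates]
    }
}
```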

It depends on the data you're dealing with. Will the name always consist of a first and last name? Then you can simply combine the first two elements from the resulting array into a new string.
Otherwise, you might have to find a different way to separate out the different pieces within the txt file. Possibly a comma? Some character that you know won't ever be used in your normal data.
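As a rough sketch of the separate-delimiter idea, assuming the file can be changed to use a comma between fields. The class name DelimitedRecord, the method parse, and the sample line are hypothetical.

```java
import java.util.Arrays;

public class DelimitedRecord {

    // Splits "name,address,hobbies" on commas and trims each field.
    // The limit of 3 keeps any further commas inside the last field.
    public static String[] parse(String line) {
        String[] parts = line.split(",", 3);
        for (int i = 0; i < parts.length; i++) {
            parts[i] = parts[i].trim();
        }
        return parts;
    }

    public static void main(String[] args) {
        // Spaces inside a field no longer matter, so a three-part name works too
        String[] fields = parse("John B Smith, ABC Street, Swimming");
        System.out.println(Arrays.toString(fields));
        // prints [John B Smith, ABC Street, Swimming]
    }
}
```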

Assuming that every line follows the format
Bob Smith ABC Street Swimming
i.e. first name, surname, and so on, this code merges the first two tokens into a single field:
String[] temp = z.split(" ");
String[] temp2 = new String[temp.length - 1];
temp2[0] = temp[0] + " " + temp[1]; // join first name and surname
for (int i = 2; i < temp.length; i++) {
    temp2[i - 1] = temp[i]; // shift the remaining tokens down by one
}
temp = temp2;

How to pattern match and transform string to generate certain output?

The code below takes input that has lots of whitespace between, before, and after the important strings. So far I have been able to filter the whitespace out. After preparing the string, I want to process it.
Here is an example of the input I may get and the output I want:
Input
+--------------+
EDIT example.mv Starter web-onyx-01.example.net.mv
Notice the whitespace before and after the domain; the amount of whitespace can be treated as random.
Output
+--------------+
example.mv. in ns web-onyx-01.example.net.mv.
In the output, the important part is the whitespace between the domain (Example.), the keywords (in and ns), and the host (web-onyx-01.example.net.mv.).
Also notice the period (".") after the domain and the host. In addition, if the domain has a (.mv) ccTLD, that part has to be removed from the string.
What I would like to achieve is this transformation over multiple lines of text: I want to batch-process an unordered, chaotic list of strings and produce clean-looking output.
The code is by no means good design, but it is what I have come up with so far. NOTE: I am a beginner who is still learning to program. I would welcome suggestions for improving the code as well as for solving the problem at hand, i.e. transforming the input into the desired output.
P.S. The output is for DNS zone files, so errors can be very problematic.
So far my code reads text from one textarea and writes the result to another textarea.
My code works as long as the array length is 2 or 3 but fails for anything larger. How can I process the input dynamically, however large the list/array may become in the future?
String s = jTextArea1.getText();
Pattern p = Pattern.compile("ADD|EDIT|DELETE|Domain|Starter|Silver|Gold|ADSL Business|Pro|Lite|Standard|ADSL Multi|Pro Plus", Pattern.MULTILINE);
Matcher m = p.matcher(s);
s = m.replaceAll("");
String ms = s.replaceAll("(?m)(^\\s+|[\\t\\f ](?=[\\t\\f ])|[\\t\\f ]$|\\s+\\z)", "");
String[] last = ms.split(" ");
for (String test : last){
System.out.println(test);
}
System.out.println("The length of array is: " +last.length);
if (str.isContain(last[0], ".mv")) {
if (last.length == 2) {
for(int i = 0; i < last.length; i++) {
last[0] = last[0].replaceFirst(".mv", "");
System.out.println(last[0]);
last[i] += ".";
if (last[i] == null ? last[0] == null : last[i].equals(last[0])) {
last[i]+= " in ns ";
}
String str1 = String.join("", last);
jTextArea2.setText(str1);
System.out.println(str1);
}
}
else if (last.length == 3) {
for(int i = 0; i < last.length; i++) {
last[0] = last[0].replaceFirst(".mv", "");
System.out.println(last[0]);
last[i] += ".";
if (last[i] == null ? last[0] == null : last[i].equals(last[0])) {
last[i]+= " in ns ";
}
if (last[i] == null ? last[1] == null : last[i].equals(last[1])){
last[i] += "\n";
}
if (last[i] == null ? last[2] == null : last[i].equals(last[2])){
last[i] = last[0] + last[2];
}
String str1 = String.join("", last);
jTextArea2.setText(str1);
System.out.println(str1);
}
}
}

As I understand your question you have multiple lines of input in the following form:
whitespace[command]whitespace[domain]whitespace[label]whitespace[target-domain]whitespace
You want to convert that to the following form such that multiple lines are aligned nicely:
[domain]. in ns [target-domain].
To do that I'd suggest the following:
1. Split your input into multiple lines.
2. Use a regular expression to check the line format (e.g. for a valid command) and extract the domains.
3. Store the maximum length of both domains separately.
4. Build a format string using the maximum lengths.
5. Iterate over the extracted domains and build a string for each line using the format defined in step 4.
Example:
String input = " EDIT domain1.mv Starter example.domain1.net.mv \n" +
" DELETE long-domain1.mv Silver long-example.long-domain1.net.mv \n" +
" ADD short-domain1.mv ADSL Business ex.sdomain1.net.mv \n";
//step 1: split the input into lines
String[] lines = input.split( "\n" );
//step 2: build a regular expression to check the line format and extract the domains - which are the (\S+) parts
Pattern pattern = Pattern.compile( "^\\s*(?:ADD|EDIT|DELETE)\\s+(\\S+)\\s+(?:Domain|Starter|Silver|Gold|ADSL Business|Pro|Lite|Standard|ADSL Multi|Pro Plus)\\s+(\\S+)\\s*$" );
List<String[]> lineList = new LinkedList<>();
int maxLengthDomain = 0;
int maxLengthTargetDomain = 0;
for( String line : lines )
{
//step 2: check the line
Matcher matcher = pattern.matcher( line );
if( matcher.matches() ) {
//step 2: extract the domains
String domain = matcher.group( 1 );
String targetDomain = matcher.group( 2 );
//step 3: get the maximum length of the domains
maxLengthDomain = Math.max( maxLengthDomain, domain.length() );
maxLengthTargetDomain = Math.max( maxLengthTargetDomain, targetDomain.length() );
lineList.add( new String[] { domain, targetDomain } );
}
}
//step 4: build the format string with variable lengths
String formatString = String.format( "%%-%ds in ns %%-%ds", maxLengthDomain + 5, maxLengthTargetDomain + 2 );
//step 5: build the output
for( String[] line : lineList ) {
System.out.println( String.format( formatString, line[0] + ".", line[1] + "." ) );
}
Result:
domain1.mv. in ns example.domain1.net.mv.
long-domain1.mv. in ns long-example.long-domain1.net.mv.
short-domain1.mv. in ns ex.sdomain1.net.mv.

How to remove blank lines in middle of a string Android

String Address[] = mSelectedaddress.split("\\|");
address.setText(
Address[1] + "\n"
+ Address[2] + "\n"
+ Address[3] + "\n"
+ Address[4]);
Actual Output:
Address 1
Address 2
=> Blank line
City
Wanted Output:
Address 1
Address 2
City
As you can see in the code above, there are scenarios where Address[position] may be blank. How can I leave that line out when it is blank?

String adjusted = adress.replaceAll("(?m)^[ \t]*\r?\n", "");
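A small sketch of how that one-liner behaves, using the same regex. The (?m) flag makes ^ match at the start of every line, so each line that holds only spaces or tabs is removed together with its line break. The class and method names are made up for illustration.

```java
public class BlankLineRemover {

    // Removes every line that is empty or contains only spaces/tabs,
    // including the line terminator that ends it.
    public static String removeBlankLines(String text) {
        return text.replaceAll("(?m)^[ \t]*\r?\n", "");
    }

    public static void main(String[] args) {
        String address = "Address 1\nAddress 2\n\nCity";
        System.out.println(removeBlankLines(address));
    }
}
```

Note that a blank final line with no trailing line break is not matched by this pattern.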

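When you build your string, check whether each part is empty before you add it.
StringBuilder builder = new StringBuilder();
// start at index 1, as in the question
for (int i = 1; i < Address.length; i++) {
    // compare contents with isEmpty(); != only compares references
    if (!Address[i].trim().isEmpty()) {
        if (builder.length() > 0) builder.append("\n");
        builder.append(Address[i]);
    }
}
address.setText(builder.toString());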

The simplest thing I can think of that should do the trick most of the time:
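mSelectedaddress.replaceAll("\\|(\\s*\\|)+", "|").split("\\|");
This collapses multiple |'s in a row (with or without whitespace between them) into a single |. Those are the cause of your empty lines. Matching only runs that start and end with | leaves the spaces inside a field such as "Address 1" untouched.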
Example:
"a|b|c|d|e||g" -> works
"a|b|c|d|e| |g" -> works
"a|b|c|d|e|||g" -> works

extract data column-wise from text file using Java

I'm working in Java and want to extract data by column from a text file.
"myfile.txt" contents:
ID SALARY RANK
065 12000 1
023 15000 2
035 25000 3
076 40000 4
I want to extract the data individually by column, i.e. ID, SALARY, RANK, etc.
Basically I want to perform operations on the individual values of each column.
I've read the data from "myfile.txt" line by line using a while loop:
while((line = b.readLine()) != null) {
stringBuff.append(line + "\n");
}
link: Reading selective column data from a text file into a list in Java
Under the above link it says to use the following:
String[] columns = line.split(" ");
But how do I use it correctly? Any hint or help, please?

You can use a regex to split on runs of whitespace, for example:
String text = "ID SALARY RANK\n" +
              "065 12000 1\n" +
              "023 15000 2\n" +
              "035 25000 3\n" +
              "076 40000 4\n";
Scanner scanner = new Scanner(text);
// read the first line; this assumes the input always starts with a header
String nextLine = scanner.nextLine();
// regex that splits on any amount of whitespace
String regex = "\\s+";
String[] header = nextLine.split(regex);
// prints all header columns; access an individual one by index,
// e.g. header[0], header[1], header[2]...
System.out.println(Arrays.toString(header));
// read the rows
while (scanner.hasNextLine()) {
    String[] row = scanner.nextLine().split(regex);
    // prints all columns of this row; access an individual one by index,
    // e.g. row[0], row[1], row[2]...
    System.out.println(Arrays.toString(row));
    System.out.println(row[0]); // first column (ID)
}

while((line = b.readLine()) != null) {
String[] columns = line.split(" ");
System.out.println("my first column : "+ columns[0] );
System.out.println("my second column : "+ columns[1] );
System.out.println("my third column : "+ columns[2] );
}
Now instead of System.out.println, do whatever you want with your columns.
But I think your columns are separated by tabs so you might want to use split("\t") instead.
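To work on a whole column at once, the split rows can be collected per column. This is only a sketch: it assumes the first line is the header and that values are separated by spaces or tabs. The class name ColumnExtractor and the method byColumn are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ColumnExtractor {

    // Groups whitespace-separated values by column, keyed by header name.
    // "\\s+" matches runs of spaces as well as tabs.
    public static Map<String, List<String>> byColumn(List<String> lines) {
        Map<String, List<String>> columns = new LinkedHashMap<>();
        if (lines.isEmpty()) {
            return columns;
        }
        String[] header = lines.get(0).trim().split("\\s+");
        for (String name : header) {
            columns.put(name, new ArrayList<>());
        }
        for (String line : lines.subList(1, lines.size())) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] row = line.trim().split("\\s+");
            for (int i = 0; i < header.length && i < row.length; i++) {
                columns.get(header[i]).add(row[i]);
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "ID SALARY RANK",
                "065 12000 1",
                "023 15000 2",
                "035 25000 3",
                "076 40000 4");
        Map<String, List<String>> columns = byColumn(lines);
        System.out.println(columns.get("SALARY")); // prints [12000, 15000, 25000, 40000]
        // Example operation on one column: sum of all salaries
        int total = columns.get("SALARY").stream().mapToInt(Integer::parseInt).sum();
        System.out.println(total); // prints 92000
    }
}
```

The values are kept as strings so that IDs such as 065 keep their leading zero; they are parsed to numbers only where an operation needs it.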
