I'm coding in Java and I need to split text that I read from a .txt file into separate elements of an array. The file is made up of several "texts", like a collection of documents.
The line before each text looks like "*TEXT" followed by some numbers, but I think the word "*TEXT" alone is enough to separate the texts.
An example of what the .txt looks like:
*TEXT 017 01/04/63 PAGE 020
THE ALLIES AFTER NASSAU IN DECEMBER 1960, THE U.S ........
*TEXT 020 01/04/63 PAGE 021
THE ROAD TO JAIL IS PAVED WITH NONOBJECTIVE ART SINCE THE KREMLIN'S SHARPEST BARBS THESE DAYS ARE AIMED AT MODERN ART AND WESTERN ESPIONAGE...
*TEXT 025 01/04/63 PAGE 024
RED CHINA FIXING FRONTIERS RED CHINA PRODUCED A SECOND SURPRISE LAST WEEK...
So I need text 017 in one position of the array and text 020 in the next position.
How can I do this?
This is the code I use to read the text from the .txt with FileReader:
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import javax.swing.JFileChooser;
public class Reader{
public static void main(String args[]){
File inFile;
FileReader fr;
BufferedReader bufReader;
JFileChooser chooser;
int reply;
String doc = "";
String line;
try{
chooser = new JFileChooser();
reply = chooser.showOpenDialog(null);
doc = chooser.getCurrentDirectory().getPath() + System.getProperty("file.separator") +
chooser.getSelectedFile().getName();
inFile = new File(doc);
fr = new FileReader(inFile);
bufReader = new BufferedReader (fr);
do{
line = bufReader.readLine();
if(line ==null )
return;
else{
System.out.println(line);
}
} while(line!=null);
bufReader.close();
}//end try
catch(Exception e)
{ System.out.println("error: "+e.getMessage()); }
}//main
}//end class reader
You could just read the entire file into a String and then use String.split(String regex)
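A minimal sketch of that suggestion, assuming the file is UTF-8 encoded and small enough to fit in memory. The class and method names (WholeFileSplit, splitOnMarker, readAndSplit) are made up for illustration. Note that splitting on the literal marker removes "*TEXT" from each piece, and the piece before the first marker is empty, so blank pieces are dropped here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class WholeFileSplit {
    // Splits the content on the literal marker "*TEXT" and drops blank pieces.
    // The '*' is escaped because split() takes a regular expression.
    static List<String> splitOnMarker(String content) {
        List<String> texts = new ArrayList<>();
        for (String piece : content.split("\\*TEXT")) {
            String trimmed = piece.trim();
            if (!trimmed.isEmpty()) {
                texts.add(trimmed);
            }
        }
        return texts;
    }

    // Reads the whole file into one String (assumed UTF-8), then splits it.
    static List<String> readAndSplit(Path file) throws IOException {
        String content = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        return splitOnMarker(content);
    }
}
```

Each element then starts with the numbers that followed the marker, for example "017 01/04/63 PAGE 020", followed by the body of that text.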
You can use FileUtils (from Apache Commons IO) to read the file and then just split it, like this:
public static void main(String[] args) throws IOException {
for (String s:FileUtils.readFileToString(new File("/home/leoks/file.txt")).split("\n")){
if (s.startsWith("*TEXT")) {
System.out.println(s.split(" ")[1]);
}
}
}
or you can write a parser using something like this
http://txt2re.com/index-java.php3?s=*TEXT%20017&-14&-1
Sorry guys, disregard my answer. I typed it so I'm leaving it, but I thought he just wanted the text number that comes after the "*TEXT" identifiers.
Try regular expressions and captures.
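String text = "this will be your document text";
// DOTALL lets '.' cross line breaks so the pattern can span the whole document
Pattern p = Pattern.compile("(.*TEXT ([0-9]{3}))+.*", Pattern.DOTALL);
Matcher m = p.matcher(text);
int numCounts = m.groupCount();
String texts[] = new String[numCounts];
// group() may only be called after a successful match
if (m.matches()) {
for (int i = 1; i <= numCounts; i++) {
// group(0) is the whole match; the capture groups start at 1
texts[i-1] = m.group(i);
}
}
// caveat: a repeated group keeps only its last capture, so this yields only the last number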
OR you can do this:
String text = "this will be your document text";
Pattern p = Pattern.compile("TEXT ([0-9]{3})");
Matcher m = p.matcher(text);
ArrayList<String> list = new ArrayList<String>();
while (m.find()) {
list.add(m.group(1));
}
// toArray() without an argument returns Object[], so pass a typed array
String texts[] = list.toArray(new String[0]);
// now they should be in your texts
Try this way
BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(
new FileInputStream("yourTextFile")));
StringBuilder br = new StringBuilder();
String newLine ="";
while(true){
String line = bufferedReader.readLine();
if(line == null)
break;
br.append(line).append('\n'); // readLine() strips the line terminator, so put it back
}
newLine = br.toString();
String arr[] = newLine.split("\\*TEXT");
System.out.println(java.util.Arrays.toString(arr));
I am working on a project where I have to read data from a file into my code. The txt file contains columns of data, and I have managed to separate a column of data into a list with this code.
public static void main(String[] args) {
String line = "";
ArrayList<String> date = new ArrayList<String>();
try {
FileReader fr = new FileReader("list.txt");
BufferedReader br = new BufferedReader(fr);
while ((line = br.readLine()) != null) {
line.split("\\s+");
date.add(line.split("\\s+")[0]);
System.out.println(line.split("\\s+")[0]);
}
} catch (IOException e) {
System.out.println("File not found!");
}
}
This will output the first column of data from the "list.txt" file which is...
30-Nov-2016
06-Oct-2016
05-Feb-2016
04-Sep-2016
18-Apr-2016
09-Feb-2016
22-Oct-2016
20-Aug-2016
17-Dec-2016
25-Dec-2016
However, I want to count the occurrence of the word "Feb" so for example it will come up...
"The month February occurs: 2 times"
But I'm struggling to find the right code. Could somebody please help me with this? I've been trying for over 24 hours and I can't find any other questions that help. Any help will be greatly appreciated.
Another solution could be using split
String month = "Feb";
int count = 0;
while ((line = br.readLine()) != null)
{
String strDate = line.split("\\s+")[0]; // get first column, which has date
String temp = strDate.split("\\-")[1]; // get Month from extracted date.
if (month.equalsIgnoreCase(temp))
{
count++;
// or store strDate into List for further process.
}
}
System.out.println (count);// should print total occurrence of date with Feb month
==Edited==
Since you are extracting the date from each line using line.split("\\s+")[0], the extracted string contains only the date.
For simplicity, you could simply use a regular expression, something like...
Pattern p = Pattern.compile("Feb", Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher("30-Nov-2016, 06-Oct-2016, 05-Feb-2016, 04-Sep-2016, 18-Apr-2016, 09-Feb-2016, 22-Oct-2016, 20-Aug-2016, 17-Dec-2016, 25-Dec-2016");
int count = 0;
while (m.find()) {
count++;
}
System.out.println("Count = " + count);
Which, based on the input, would be 2.
Now, obviously, if you're reading each value from a file one at a time, this is not that efficient, and simply using something like...
if (line.toLowerCase().contains("feb")) {
count++;
}
would be simpler and quicker
Updated...
So, based on the provided input data and the following code...
Pattern p = Pattern.compile("Feb", Pattern.CASE_INSENSITIVE);
int count = 0;
try (BufferedReader br = new BufferedReader(new InputStreamReader(Test.class.getResourceAsStream("Data.txt")))) {
String text = null;
while ((text = br.readLine()) != null) {
Matcher m = p.matcher(text);
if (m.find()) {
count++;
}
}
System.out.println(count);
} catch (IOException ex) {
Logger.getLogger(Test.class.getName()).log(Level.SEVERE, null, ex);
}
It prints 67.
Now, this is a brute-force method, because I'm checking the whole line. To avoid possible mismatches elsewhere in the text, you should split the line on the common delimiter (i.e. the tab character) and check only the first element, for example...
String[] parts = text.split("\t");
Matcher m = p.matcher(parts[0]);
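Putting the ideas above together, here is a minimal sketch that checks only the month part of the first column. It assumes each line starts with a date in dd-MMM-yyyy form followed by whitespace-separated columns. MonthCounter and countMonth are names invented for this example, and the sample lines are hard-coded in place of the real list.txt.

```java
import java.util.Arrays;
import java.util.List;

class MonthCounter {
    // Counts lines whose first column is a dd-MMM-yyyy date with the given month.
    static int countMonth(List<String> lines, String month) {
        int count = 0;
        for (String line : lines) {
            // First whitespace-separated column, e.g. "05-Feb-2016"
            String firstColumn = line.trim().split("\\s+")[0];
            String[] dateParts = firstColumn.split("-");
            // Lines without a three-part date are skipped
            if (dateParts.length == 3 && dateParts[1].equalsIgnoreCase(month)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "30-Nov-2016", "06-Oct-2016", "05-Feb-2016", "04-Sep-2016", "18-Apr-2016",
                "09-Feb-2016", "22-Oct-2016", "20-Aug-2016", "17-Dec-2016", "25-Dec-2016");
        // prints "The month February occurs: 2 times"
        System.out.println("The month February occurs: " + countMonth(lines, "Feb") + " times");
    }
}
```

Comparing only the month field avoids counting a "feb" that happens to appear in some other column of the same line.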
I am trying to read data from a text file using a BufferedReader. I want to split the data into two arrays, one of doubles and one of Strings. Below is the content of the text file:
55.6
Scholtz
85.6
Brown
74.9
Alawi
45.2
Weis
68.0
Baird
55
Baynard
68.5
Mills
65.1
Gibb
80.7
Grovner
87.6
Weaver
74.8
Kennedy
83.5
Landry.
Basically, I'm trying to put all the numbers into the double array and all the names into the String array. Any ideas?
You could read the entire content from the BufferedReader into one String and then use regular expressions to pull out the numbers and the names. A regex like \d+(\.\d+)? should match the numbers, with or without a decimal part, and a regex like [A-Za-z]+ should match the names. Then collect the matches of each expression into their respective arrays.
Try this:
String file = "path to file";
double dArr[] = new double[100];
String sArr[] = new String[100];
int i = 0, j = 0;
try {
FileReader fr = new FileReader(file);
BufferedReader br = new BufferedReader(fr);
String line;
while ((line = br.readLine()) != null) {
Pattern p = Pattern.compile("([0-9]*)\\.[0-9]*"); // should start with any number of 0-9 then "." and then any number of 0-9
Matcher m = p.matcher(line);
if (m.matches()) {
dArr[i] = Double.parseDouble(line);
i++;
} else {
sArr[j] = line;
j++;
}
}
} catch (IOException e) {
e.printStackTrace();
}
Suggestion: use a List instead of an array if you are uncertain about the number of elements.
Note that 55 is treated as a String here, because the pattern requires a decimal point and 55 has none.
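As a sketch of both notes above, the variant below uses Lists and lets Double.parseDouble decide what counts as a number, so 55 ends up with the numbers. ScoreParser and accept are hypothetical names. One assumption to be aware of: parseDouble also accepts strings such as "NaN" or "Infinity", so a name spelled exactly like that would be misclassified.

```java
import java.util.ArrayList;
import java.util.List;

class ScoreParser {
    final List<Double> numbers = new ArrayList<>();
    final List<String> names = new ArrayList<>();

    // Sorts one line into either the numbers or the names list.
    void accept(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return; // ignore blank lines
        }
        try {
            numbers.add(Double.parseDouble(trimmed));
        } catch (NumberFormatException e) {
            // Not parseable as a double, so treat it as a name
            names.add(trimmed);
        }
    }
}
```

In the reading loop you would call accept(line) for every line returned by readLine(), and convert the lists to arrays at the end if arrays are really needed.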
I have two text files on my drive. I want to display their contents in the console, but any string that appears more than once should be displayed only once instead of twice.
Code:
public class read {
public static void main(String[] args) {
try{
File file = new File("D:\\file1.txt");
FileReader fileReader = new FileReader(file);
BufferedReader br = new BufferedReader(fileReader);
StringBuffer stringBuffer = new StringBuffer();
String line;
while((line = br.readLine()) != null){
stringBuffer.append(line);
stringBuffer.append("\n");
}
fileReader.close();
System.out.println("Contents of file1:");
String first = stringBuffer.toString();
System.out.println(first);
File file1 = new File("D:\\file2.txt");
FileReader fileReader1 = new FileReader(file1);
BufferedReader br1 = new BufferedReader(fileReader1);
StringBuffer stringBuffer1 = new StringBuffer();
String line1;
while((line1 = br1.readLine()) != null){
stringBuffer1.append(line1);
stringBuffer1.append("\n");
}
fileReader1.close();
System.out.println("Contents of file2:");
String second = stringBuffer1.toString();
System.out.println(second);
System.out.println("answer:");
System.out.println(first+second);
}catch (IOException e) {
// TODO: handle exception
e.printStackTrace();
}
}
}
Output is:
answer:
hi hello
how are you
hi ya
i am fine
But I want to compare both strings, and if the same string is repeated it should be displayed only once.
The output I expect is like this:
answer:
hi hello
how are you
ya
i am fine
Here "hi" is found in both strings, so I need to delete the duplicate.
How can I do that? Please help.
Thanks in advance.
You can pass your lines through this method to parse out duplicate words:
// store unique previous words
static Set<String> words = new HashSet<>();
static String removeDuplicateWords(String line) {
StringJoiner sj = new StringJoiner(" ");
// split on whitespace to get distinct words
for (String word : line.split("\\s+")) {
// try to add word to the set
if (words.add(word)) {
// if the word was added (=not seen before), append to the result
sj.add(word);
}
}
return sj.toString();
}
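For illustration, here is one way the same idea could be applied to the lines of both files in order. This is a sketch: DuplicateWordDemo and dedupe are invented names, the four sample lines are hard-coded in place of file1.txt and file2.txt, and the set of seen words is kept local to the call instead of in a static field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.StringJoiner;

class DuplicateWordDemo {
    // Returns the lines with every word that was already seen earlier removed.
    static List<String> dedupe(List<String> lines) {
        Set<String> seen = new HashSet<>();
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            StringJoiner sj = new StringJoiner(" ");
            for (String word : line.trim().split("\\s+")) {
                // add() returns false when the word is already in the set
                if (!word.isEmpty() && seen.add(word)) {
                    sj.add(word);
                }
            }
            result.add(sj.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        // The first two lines stand in for file1, the last two for file2
        List<String> lines = Arrays.asList("hi hello", "how are you", "hi ya", "i am fine");
        for (String line : dedupe(lines)) {
            System.out.println(line);
        }
        // prints:
        // hi hello
        // how are you
        // ya
        // i am fine
    }
}
```

A line made up entirely of repeated words becomes an empty string, so you may want to skip empty results when printing. The comparison is also case-sensitive, so "Hi" and "hi" count as different words.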
EDITED
I'm trying to split a text into an array. I have a .txt file made up of several texts, like a collection of texts, and I need the whole of each text in its own position of the array.
I'm choosing the file with a JFileChooser, then trying to process its content with the regex-based String.split and print the result. The FileChooser part works, but I don't know whether the split into the array works, because System.out does not print the expected array with all the texts.
This is an example of the .txt. Each text is separated by a "*TEXT" line.
*TEXT 017 01/04/63 PAGE 020
THE ALLIES AFTER NASSAU IN DECEMBER 1960, THE U.S ........
*TEXT 020 01/04/63 PAGE 021
THE ROAD TO JAIL IS PAVED WITH NONOBJECTIVE ART SINCE THE KREMLIN'S SHARPEST BARBS THESE DAYS ARE AIMED AT MODERN ART AND WESTERN ESPIONAGE...
*TEXT 025 01/04/63 PAGE 024
RED CHINA FIXING FRONTIERS RED CHINA PRODUCED A SECOND SURPRISE LAST WEEK...
And this is my code: first the FileChooser, then the String.split.
import java.io.*;
import java.lang.Object.*;
import java.util.regex.*;
import javax.swing.JFileChooser;
public class Reader{
public static void main(String args[]) throws IOException{
File inFile;
FileReader fr;
BufferedReader bufReader;
JFileChooser chooser;
int reply;
String doc = "";
String line;
try{
chooser = new JFileChooser();
reply = chooser.showOpenDialog(null);
doc = chooser.getCurrentDirectory().getPath() + System.getProperty("file.separator") + chooser.getSelectedFile().getName();
inFile = new File(doc);
fr = new FileReader(inFile);
bufReader = new BufferedReader (fr);
do{
line = bufReader.readLine();
if(line ==null )
return;
} while(line!=null);
//**HERE STARTS THE STRING.SPLIT**
//"line" at the end of next line it supposed to be the whole .txt
BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(new FileInputStream(line)));
StringBuilder br = new StringBuilder();
String newLine ="";
while(true){
if(line == null)
break;
br.append(line);
}
newLine = br.toString();
String arr[] = newLine.split("\\*TEXT");
System.out.println(java.util.Arrays.toString(arr));
//**HERE ENDS**
bufReader.close();
}//end try
catch(Exception e)
{ System.out.println("error: "+e.getMessage()); }
}//main
}//end class reader
Thanks for your help! :3
First and foremost, your code never reaches the call to split. Your code has only two paths: the user cancels the file chooser dialog, which causes a NullPointerException, or a file is chosen, which inevitably hits the return statement in
...
do{
line = bufReader.readLine();
if(line ==null )
return;
} while(line!=null);
Second, the line
BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(new FileInputStream(line)));
tries to open a file whose name is the value of the variable line, i.e. a line of text from the document rather than a path, which is probably not what you want.
A fixed version of your code would be:
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.List;
import javax.swing.JFileChooser;
public class Reader{
public static void main(String args[]){
JFileChooser chooser = new JFileChooser();
int reply = chooser.showOpenDialog(null);
// Stop if the user cancelled the dialog; there is no selected file to read
if (reply != JFileChooser.APPROVE_OPTION) {
return;
}
try{
// Read all the lines in the file at once
List<String> lines = Files.readAllLines(chooser.getSelectedFile().toPath(), StandardCharsets.UTF_8);
// Merge the read lines into a String
StringBuilder sb = new StringBuilder();
for (String line : lines){
sb.append(line);
sb.append('\n');
}
String newLine = sb.toString();
// Split the String
String arr[] = newLine.split("\\*TEXT");
System.out.println(java.util.Arrays.toString(arr));
}//end try
catch(IOException e)
{ System.out.println("error: "+e.getMessage()); }
}//main
}//end class reader
Note that the API classes Files and Paths are only available from Java 7 (1.7) onward.
You can split the text using this regex:
^(?=\*TEXT)
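A short sketch of how that regex could be used from Java. Two assumptions: the (?m) flag is added so that ^ matches at the start of every line, not only at the start of the input, and the code runs on Java 8 or later, where a zero-width match at the very beginning does not produce an empty leading element. LookaheadSplit and splitKeepingHeaders are illustrative names.

```java
class LookaheadSplit {
    // Splits before every line that starts with "*TEXT".
    // The lookahead (?=...) consumes nothing, so each piece keeps its header line.
    static String[] splitKeepingHeaders(String content) {
        return content.split("(?m)^(?=\\*TEXT)");
    }

    public static void main(String[] args) {
        String content = "*TEXT 017 01/04/63 PAGE 020\nTHE ALLIES AFTER NASSAU\n"
                + "*TEXT 020 01/04/63 PAGE 021\nTHE ROAD TO JAIL\n";
        for (String text : splitKeepingHeaders(content)) {
            System.out.println("---");
            System.out.print(text);
        }
    }
}
```

Unlike a plain split on "\\*TEXT", this keeps the "*TEXT 017 ..." header as the first line of each element, which is useful if the number is needed later.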
I am writing a Java program that reads a text file, performs some modifications to the data, then writes the result to a new output file.
The input text file looks like this...
url = http://184.154.145.114:8013/wlraac name = wlr samplerate = 44100 channels =2 format = S16le~
url = http://newstalk.fmstreams.com:8080 name = newstalk samplerate = 22050 channels = 1 format = S16le
The program needs to change the samplerate to 44100 and the channels to 1 if they don't already have these values. I would also like to remove the url and name pieces completely. After these changes, the new line needs to be written to a different output text file.
So far, all my program can do is select a file and display its contents to the user. Could someone please point me in the right direction for how my program should work to achieve the required outcome?
As somebody asked, here is what I have so far:
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import javax.swing.JFileChooser;
import javax.swing.JFrame;
public class reader2 {
public reader2() {
}
public static void main(String[] args) {
reader(args);
}
public static void reader(String[] args) {
JFileChooser chooser = new JFileChooser();
chooser.setCurrentDirectory(new File("."));
chooser.setFileFilter(new javax.swing.filechooser.FileFilter() {
public boolean accept(File f) {
return f.getName().toLowerCase().endsWith(".txt")
|| f.isDirectory();
}
public String getDescription() {
return "Text Documents (.txt)";
}
});
int r = chooser.showOpenDialog(new JFrame());
if (r == JFileChooser.APPROVE_OPTION) {
String name = chooser.getSelectedFile().getName();
String pathToFIle = chooser.getSelectedFile().getPath();
System.out.println(name);
try{
BufferedReader reader = new BufferedReader( new FileReader( pathToFIle ) ); //Setup the reader
while (reader.ready()) { //While there are content left to read
String line = reader.readLine(); //Read the next line from the file
String[] tokens = line.split( "url = " ); //Split the string at every occurrence of "url = ". Place the results in an array.
for (String token : tokens){ //Iterate through all of the found results
//System.out.println(token);
System.out.println(token);
}
}
reader.close(); //Stop using the resource
}catch (Exception e){//Catch exception if any
System.err.println("Error: " + e.getMessage());
}
}
}
}
You will need to do something like this:
1. Read the contents of the file, one line at a time.
2. Split the line up into its individual components, for example by splitting it on the space character.
3. Change the sample rate and channel values according to your question.
4. Write the line out to a file, and start again from step 1.
If you give this a try, post some code on StackExchange with any problems and we'll try to assist.
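As a rough sketch of the per-line change in those steps, the method below uses regular-expression replacement instead of splitting on spaces. It assumes every field has the form "key = value" with no whitespace inside the value, as in the two sample lines. StreamLineRewriter and rewrite are made-up names.

```java
class StreamLineRewriter {
    // Drops the url and name fields and forces samplerate = 44100 and channels = 1.
    static String rewrite(String line) {
        return line
                // remove "url = <value>" and "name = <value>" with trailing spaces
                .replaceAll("\\burl\\s*=\\s*\\S+\\s*", "")
                .replaceAll("\\bname\\s*=\\s*\\S+\\s*", "")
                // \\s* tolerates a missing space, as in "channels =2"
                .replaceAll("\\bsamplerate\\s*=\\s*\\d+", "samplerate = 44100")
                .replaceAll("\\bchannels\\s*=\\s*\\d+", "channels = 1")
                .trim();
    }

    public static void main(String[] args) {
        String in = "url = http://newstalk.fmstreams.com:8080 name = newstalk "
                + "samplerate = 22050 channels = 1 format = S16le";
        // prints "samplerate = 44100 channels = 1 format = S16le"
        System.out.println(rewrite(in));
    }
}
```

Each rewritten line can then be written to the output file with a BufferedWriter, one line per input line.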
Can you try this:
File file = new File( fileName );
File tempFile = File.createTempFile("buffer", ".tmp");
FileWriter fw = new FileWriter(tempFile);
Reader fr = new FileReader(file);
BufferedReader br = new BufferedReader(fr);
String line;
// readLine() returns null at end of file; ready() only says whether a read would block
while ((line = br.readLine()) != null) {
String newLine = line.replaceAll( "samplerate =\\s*\\d+", "samplerate = 44100");
newLine = newLine.replaceAll( "channels =\\s*\\d+", "channels = 1");
fw.write(newLine + "\n");
}
fw.close();
br.close();
fr.close();
// Finally replace the original file.
tempFile.renameTo(file);
Ref: Files java replacing characters