Process of making a huge txt into array in Java

I have thousands of sentences in a txt file, and my first Android application should pick one of them and show it in a TextView.
I could ship the txt file as a resource, or alternatively convert all the sentences to an array. I don't want to put the txt file into the application, only the array with the sentences. How can I automatically "translate" thousands of sentences into an array-like list?

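Don't reinvent the wheel. With Apache Commons IO this is a one-liner:
import java.io.FileInputStream;
import java.nio.charset.StandardCharsets;
import org.apache.commons.io.IOUtils;
// one list entry per line; UTF-8 is an assumption about the file's encoding
List<String> sentences = IOUtils.readLines(new FileInputStream("filename.txt"), StandardCharsets.UTF_8);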

I guess this is what you don't want to do:
//InputStream is = getResources().openRawResource(R.raw.list);
Otherwise, get an InputStream object from somewhere else and use the following code:
List<String> content = new ArrayList<String>();
BufferedReader reader = new BufferedReader(new InputStreamReader(is));
try {
    String line;
    // readLine() returns null at end of file, so the line count is not needed
    while ((line = reader.readLine()) != null) {
        content.add(line);
    }
    reader.close();
} catch (IOException e) {
    e.printStackTrace();
}

If you know that each item contains exactly one period, you can split on that. If you can't do that but can use newlines, split on those instead.
final String blob = "Quote 1. Quote2. Quote '3'.";
String[] quotes = blob.split("\\.\\s*"); // a period plus any whitespace after it
> ["Quote 1", "Quote2", "Quote '3'"]
OR
final String blob = "Quote 1.\nQuote 2 longer.";
String[] quotes = blob.split("\n");
> ["Quote 1.", "Quote 2 longer."]

It sounds like you want to put the text file in your resources folder and then read it using a BufferedReader.
Depending on the contents of your text file, you could read the text line by line and add it to an array as you go, or you could just read the entire file as a string and use .split(), which would return an array of strings for you to use.
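As a rough sketch of the line-by-line approach described above: the class and method names below are made up for illustration, and a StringReader stands in for the real file. On Android you would presumably wrap the stream from getResources().openRawResource(...) in an InputStreamReader instead.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any library.
class SentenceLoader {
    // Reads every non-blank line from the reader into a list, one sentence per line.
    static List<String> readSentences(Reader source) {
        List<String> sentences = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (!trimmed.isEmpty()) {
                    sentences.add(trimmed);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sentences;
    }

    public static void main(String[] args) {
        // A StringReader stands in for the resource file here.
        List<String> sentences = readSentences(new StringReader("First one.\n\nSecond one.\n"));
        String[] asArray = sentences.toArray(new String[0]);
        System.out.println(asArray.length + " sentences, first: " + asArray[0]);
    }
}
```

Collecting into a List first and converting with toArray at the end avoids having to know the number of lines in advance.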

Related

Read file using delimiter and add to array

I am trying to read from a text file in my project workspace and then:
Create an object depending on the first element on each line of the file
Set some variables within the object
Add it to my ArrayList
I seem to be reading the file OK, but I am struggling to create the different objects based on the first element of each line in the text file
Text file is like this
ul,1,gg,0,33.0
sl,2,hh,0,44.0
My expected result is to create an UltimateLanding object or a StrongLanding object based on the first element of each line in the example file above
Disclaimer - I know the .equals is not correct to use in the IF statement, I've tried many ways to resolve this
My Code -
Edited -
It seems the program is now reading the file correctly and adding to the array. However, it is only doing this for the first line in the file. There should be 2 objects created, as there are 2 lines in the text file.
Scanner myFile = new Scanner(fr);
String line;
myFile.useDelimiter(",");
while (myFile.hasNext()) {
line = myFile.next();
if (line.equals("sl")) {
StrongLanding sl = new StrongLanding();
sl.setLandingId(Integer.parseInt(myFile.next()));
sl.setLandingDesc(myFile.next());
sl.setNumLandings(Integer.parseInt(myFile.next()));
sl.setCost(Double.parseDouble(myFile.next()));
landings.add(sl);
} else if (line.equals("ul")) {
UltimateLanding ul = new UltimateLanding();
ul.setLandingId(Integer.parseInt(myFile.next()));
ul.setLandingDesc(myFile.next());
ul.setNumLandings(Integer.parseInt(myFile.next()));
ul.setCost(Double.parseDouble(myFile.next()));
landings.add(ul);
}
}
TIA
There are multiple issues with your current code.
myFile.equals("sl") compares your Scanner object with a String. You would actually want to compare your read string line, not your Scanner object. So line.equals("sl").
nextLine() will read the whole line. So line will never be equal to "sl". You should split the line using your specified delimiter, then use the split parts to build your object. This way, you will not have to worry about newline in combination with next().
Currently, the evaluation of the read input is outside the while loop, so you read all the content of the file but only evaluate the last line. Move the evaluation of the input and the creation of your landing objects inside the while loop.
All suggestions implemented:
...
Scanner myFile = new Scanner(fr);
// no need to specify a delimiter, since you want to read line by line
String line;
String[] splitLine;
while (myFile.hasNextLine()) {
line = myFile.nextLine();
splitLine = line.split(","); // split the line by ","
if (splitLine[0].equals("sl")) {
StrongLanding sl = new StrongLanding();
sl.setLandingId(Integer.parseInt(splitLine[1]));
sl.setLandingDesc(splitLine[2]);
sl.setNumLandings(Integer.parseInt(splitLine[3]));
sl.setCost(Double.parseDouble(splitLine[4]));
landings.add(sl);
} else if (splitLine[0].equals("ul")) {
UltimateLanding ul = new UltimateLanding();
ul.setLandingId(Integer.parseInt(splitLine[1]));
ul.setLandingDesc(splitLine[2]);
ul.setNumLandings(Integer.parseInt(splitLine[3]));
ul.setCost(Double.parseDouble(splitLine[4]));
landings.add(ul);
}
}
...
However, if you don't want to read the contents line by line (due to whatever requirement you have), you can keep reading it via next(), but you have to specify the delimiter correctly:
...
Scanner myFile = new Scanner(fr);
String line; // variable naming could be improved, since it's not the line
myFile.useDelimiter(",|\\R"); // comma or any line break (\R covers both \n and \r\n, Java 8+)
while (myFile.hasNext()) {
line = myFile.next();
if (line.equals("sl")) {
StrongLanding sl = new StrongLanding();
sl.setLandingId(Integer.parseInt(myFile.next()));
sl.setLandingDesc(myFile.next());
sl.setNumLandings(Integer.parseInt(myFile.next()));
sl.setCost(Double.parseDouble(myFile.next()));
landings.add(sl);
} else if (line.equals("ul")) {
UltimateLanding ul = new UltimateLanding();
ul.setLandingId(Integer.parseInt(myFile.next()));
ul.setLandingDesc(myFile.next());
ul.setNumLandings(Integer.parseInt(myFile.next()));
ul.setCost(Double.parseDouble(myFile.next()));
landings.add(ul);
}
}
...
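To see why a comma-only delimiter breaks on multi-line input, it can help to print the tokens the Scanner actually produces. The sketch below uses a made-up helper class and the two sample lines from the question. With "," alone, the line break stays inside one token, so the last field of line one and the first field of line two arrive glued together.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical demo class for comparing delimiter patterns.
class DelimiterDemo {
    // Collects every token the Scanner produces for the given delimiter regex.
    static List<String> tokens(String input, String delimiterRegex) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            scanner.useDelimiter(delimiterRegex);
            while (scanner.hasNext()) {
                result.add(scanner.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String data = "ul,1,gg,0,33.0\nsl,2,hh,0,44.0";
        // Comma only: "33.0\nsl" comes back as a single token.
        System.out.println(tokens(data, ","));
        // Comma or line break: every field is its own token.
        System.out.println(tokens(data, ",|\\R"));
    }
}
```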
Another solution, using Java 8 streams (this assumes both classes share a Landing supertype):
List<Landing> landings = Files.lines(Paths.get("LandingsData.txt")).map(line -> {
String[] split = line.split(",");
if (split[0].equals("sl")) {
StrongLanding sl = new StrongLanding();
sl.setLandingId(Integer.parseInt(split[1]));
sl.setLandingDesc(split[2]);
sl.setNumLandings(Integer.parseInt(split[3]));
sl.setCost(Double.parseDouble(split[4]));
return sl;
} else if (split[0].equals("ul")) {
UltimateLanding ul = new UltimateLanding();
ul.setLandingId(Integer.parseInt(split[1]));
ul.setLandingDesc(split[2]);
ul.setNumLandings(Integer.parseInt(split[3]));
ul.setCost(Double.parseDouble(split[4]));
return ul;
}
return null;
}).filter(t -> t!= null).collect(Collectors.toList());

RDF/XML for apache jena format using java

The error I get when I change list1[0] to list1[1]:
I am writing a program that prints data from a file using an RDF model in Java. I wanted the object to be a String, but I couldn't find a way to do it. I tried using a 2-D array so that it reads from the file and prints the data to the output screen. However, it doesn't work and I can't figure out why.
Here is my code:
String synonyms =null;
try {
File file1 = new File("Data/9687.txt");
FileReader fileReader1 = new FileReader(file1);
BufferedReader bufferedReader1 = new BufferedReader(fileReader1);
StringBuffer stringBuffer = new StringBuffer();
String line1;
System.out.println("Proteins & Synonyms:");
while ((bufferedReader1.readLine()) != null) {
line1 = bufferedReader1.readLine();
String[] list1 = line1.split(“/t”)
synonyms=model1.expandPrefix(list1[0]);
proteinG.addProperty(hasSynonyms,synonyms);
And here is the OUTPUT message shown:
<https://Bio2cv.net/ENSP000003488> <hasSynonyms> "ENSP000003488" .
The output for the resource is the same as the string.
Is the synonym name in the second column of the input file?
If so, you are using the wrong index (0) here:
synonyms=model1.expandPrefix(list1[0]);
Change it to 1 and also remove the model1.expandPrefix() call if you want a plain string literal:
synonyms=list1[1];
For skipping invalid lines (without tab character) change the code after the split() call. Check the length of the list1 array:
String[] list1 = line1.split("\t");
if (list1.length < 2) continue;
You are also reading two lines from the input per iteration instead of one, so every other line is silently skipped.
Change this code:
while ((bufferedReader1.readLine()) != null) {
line1 = bufferedReader1.readLine();
to this:
while ((line1 = bufferedReader1.readLine()) != null) {
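Putting these fixes together, the reading loop might look roughly like the sketch below. The Jena calls are left out so that it runs on its own; SynonymReader and readSynonyms are invented names, and the sample identifiers in main are placeholders rather than real data.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the RDF model calls from the question are omitted.
class SynonymReader {
    // Returns the second tab-separated column of every valid line.
    static List<String> readSynonyms(Reader source) {
        List<String> synonyms = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            // readLine() is called exactly once per iteration, inside the condition.
            while ((line = reader.readLine()) != null) {
                String[] columns = line.split("\t");
                if (columns.length < 2) {
                    continue; // skip lines without a second column
                }
                synonyms.add(columns[1]);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return synonyms;
    }

    public static void main(String[] args) {
        // Placeholder input: two valid lines and one line without a tab.
        String sample = "ID1\tSYN_A\nno tab here\nID2\tSYN_B\n";
        System.out.println(readSynonyms(new StringReader(sample)));
    }
}
```

In the original program, each value returned here would then be passed to proteinG.addProperty(hasSynonyms, ...) as a plain string literal.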

Searching a text file in java and Listing the results

I've really searched around for ideas on how to go about this, and so far nothing's turned up.
I need to search a text file via keywords entered in a JTextField and present the search results to the user in rows, like Google does. The text file has a lot of content, about 22,000 lines of text. I want to filter out lines that contain none of the words specified in the JTextField and only present lines containing at least one of those words, each row of the search results being a line from the text file.
Does anyone have any ideas on how to go about this? I would really appreciate any kind of help. Thank you in advance.
You can read the file line by line and search in every line for your keywords. If you find one, store the line in an array.
But first split your text box String on whitespace to create the keyword array:
String[] keyWords = yourTextBoxString.trim().split("\\s+");
ArrayList<String> results = new ArrayList<String>();
Reading the file line by line:
void readFileLineByLine(File file) throws IOException {
    // try-with-resources closes the reader even if readLine() throws
    try (BufferedReader br = new BufferedReader(new FileReader(file))) {
        String line;
        while ((line = br.readLine()) != null) {
            processOneLine(line);
        }
    }
}
Processing the line:
void processOneLine(String line) {
for (String currentKey : keyWords) {
if (line.contains(currentKey)) {
results.add(line);
break;
}
}
}
I have not tested this, but it should give you an overview of how you can do this.
If you need more speed, you can also use a regular expression to search for the keywords, so you don't need this for loop.
Read in the file, as per the Oracle tutorial, http://docs.oracle.com/javase/tutorial/essential/io/file.html#textfiles Iterate through each line and search for your keyword(s) using String's contains method. If a line contains the search phrase, place the line and its line number in a results List. When you've finished, you can display the results list to the user.
You need a method as follows:
List<String> searchFile(String path, String match) {
    List<String> linesToPresent = new ArrayList<String>();
    // compile the pattern once instead of once per line
    Pattern p = Pattern.compile(match);
    try (BufferedReader br = new BufferedReader(new FileReader(path))) {
        String line;
        // test for null before matching; matching a null line would throw a NullPointerException
        while ((line = br.readLine()) != null) {
            if (p.matcher(line).find()) {
                linesToPresent.add(line);
            }
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
    return linesToPresent;
}
It searches a file line by line and checks with regex if a line contains a "match" String. If you have many Strings to check you can change the second parameter to String[] match and with a foreach loop check for each String match.
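For several keywords, one possible variation is to combine them into a single alternation pattern rather than looping over them. The sketch below is only an illustration with invented names. It quotes each keyword so that regex metacharacters are matched literally, and it matches case-insensitively, which is an assumption about what a search box should do.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper for keyword search over lines of text.
class KeywordSearch {
    // Builds one pattern that matches any of the keywords literally.
    static Pattern anyOf(String... keywords) {
        String alternation = Arrays.stream(keywords)
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        return Pattern.compile(alternation, Pattern.CASE_INSENSITIVE);
    }

    // Returns every line that contains at least one keyword.
    static List<String> search(Reader source, String... keywords) {
        Pattern pattern = anyOf(keywords);
        List<String> hits = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (pattern.matcher(line).find()) {
                    hits.add(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return hits;
    }

    public static void main(String[] args) {
        String text = "apple pie\nbanana split\ncherry tart\nApple crumble";
        System.out.println(search(new StringReader(text), "apple", "cherry"));
    }
}
```

An empty keyword list would produce an empty pattern that matches every line, so the caller should reject empty input before searching.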
You can use FileUtils.readLines from Apache Commons IO.
This will read each line and return a List<String>.
You can iterate over this List<String> and check whether each String contains the word entered by the user; if it does, add it to another List<String>. At the end you will have a second List<String> containing all the lines that contain the word entered by the user, which you can iterate over to display the results.

Java - Splitting a CSV file into an Array

I have managed to split a CSV file based on the commas. I did this by placing a dummy String where ever there was a ',' and then splitting based on the dummy String.
However, the CSV file contains things such as:
something, something, something
something, something, something
Therefore, where there is a new line, the last and first values of each line get merged into their own string. How can I solve this? I've tried placing my dummy string where \n is found to split it based on that but to no success.
Help?!
I would strongly recommend not reinventing the wheel :). Go with one of the already available libraries for handling CSV files, e.g. OpenCSV.
I don't see why you need a dummy string. Why not split on comma?
BufferedReader in = new BufferedReader(new FileReader("file.csv"));
String line;
while ((line = in.readLine()) != null) {
String[] fields = line.split(",");
}
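A slightly fuller sketch of the same idea, collecting the rows and trimming the spaces that follow each comma in the sample data. SimpleCsv and readRows are invented names, and this deliberately ignores quoted fields with embedded commas, which need a real CSV library.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; handles only simple, unquoted CSV.
class SimpleCsv {
    // Reads one row per line, so values from adjacent lines are never merged.
    static List<String[]> readRows(Reader source) {
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (trimmed.isEmpty()) {
                    continue; // ignore blank lines
                }
                // Split on a comma plus any surrounding whitespace.
                rows.add(trimmed.split("\\s*,\\s*"));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }

    public static void main(String[] args) {
        String csv = "a, b, c\nd, e, f\n";
        for (String[] row : readRows(new StringReader(csv))) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```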
The dummy strings you mentioned are unnecessary if you use an existing library. I would recommend the open source library uniVocity-parsers, which provides a simplified API, good performance, and flexibility.
A few lines of code are enough to read CSV data into memory as arrays:
private static void parseCSV() throws FileNotFoundException {
CsvParser parser = new CsvParser(new CsvParserSettings());
List<String[]> parsedData = parser.parseAll(new FileReader("/examples/example.csv"));
for (String[] row : parsedData) {
StringBuilder strBuilder = new StringBuilder();
for (String col : row) {
strBuilder.append(col).append("\t");
}
System.out.println(strBuilder);
}
}
Use the following. nextLine() reads one line at a time, so values from adjacent lines are never merged:
String[] a = scanner.nextLine().split(",");

Read two lines of CSV at a time (Java)

Is there a way to read two lines of a csv file at a time in Java?
I can read one at a time using Scanner (it has to be done like this):
String line = input.nextLine();
String[] nline = line.split ("[,]");
......
Here is some sample data and a short explanation. I need these read two at a time so I can go about my other processing.
The first line, which starts with "create", creates a person
The second line, "action", is the action of the created person
create,Mr. Jones,blah,blah
action,1,3
create,Mrs.Smith,blah,blah
action,4,10
....
Thanks in advance.
If you're looking to parse CSV files in Java, I'd avoid splitting lines with the String.split() method. You can run into issues if a field contains commas. For Java I'd recommend opencsv to parse the data. Similar to using the Scanner, you can read the file line by line, or slurp the entire file if it's not too large and iterate over the list two items at a time.
CSVReader reader = ...
String[] firstLine; // fields from first line
String[] secondLine; // fields from second line
while ((firstLine = reader.readNext()) != null && (secondLine = reader.readNext()) != null) {
// do something with two lines
}
Or
CSVReader reader = ...
List<String[]> allLines = reader.readAll();
// TODO: validate we have an even number of lines
for (int i = 0; i < allLines.size(); i += 2) {
String[] firstLine = allLines.get(i);
String[] secondLine = allLines.get(i+1);
// do something with two lines
}
String line = input.nextLine() + input.nextLine();
Cory's answer is good but for your next part to work
String[] nline = line.split ("[,]");
......
you need to add a comma between the two lines:
String line = input.nextLine() + "," + input.nextLine();
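A minimal sketch of the Scanner-only approach, using the sample data from the question. PairedLines and readPairs are invented names, and dropping an unpaired trailing line is an assumption about what the processing should do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper that consumes the input two lines at a time.
class PairedLines {
    // Each result array holds the fields of a "create" line followed by its "action" line.
    static List<String[]> readPairs(Scanner input) {
        List<String[]> pairs = new ArrayList<>();
        while (input.hasNextLine()) {
            String first = input.nextLine();
            if (!input.hasNextLine()) {
                break; // odd number of lines: ignore the unpaired last line
            }
            String second = input.nextLine();
            // The explicit comma keeps the last field of line one
            // separate from the first field of line two.
            pairs.add((first + "," + second).split(","));
        }
        return pairs;
    }

    public static void main(String[] args) {
        String data = "create,Mr. Jones,blah,blah\naction,1,3\n"
                    + "create,Mrs.Smith,blah,blah\naction,4,10\n";
        for (String[] fields : readPairs(new Scanner(data))) {
            System.out.println(fields[1] + " -> " + fields[4] + " " + fields[5] + "," + fields[6]);
        }
    }
}
```

With this layout, indexes 0 to 3 hold the "create" fields and indexes 4 to 6 hold the "action" fields.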
