Reading a .txt file and excluding certain elements - java

In my journey to complete this program I've run into a little hitch with one of my methods. The method I am writing reads a certain .txt file and creates a HashMap, setting every word found as a key and the number of times it appears as its value. I have managed to figure this out for another method, but this time the .txt file the method reads is in a weird format. Specifically:
more 2
morning's 1
most 3
mostly 1
mythology. 1
native 1
nearly 2
northern 1
occupying 1
of 29
off 1
And so on.
Right now, the method is returning only one line in the file.
Here is my code for the method:
public static HashMap<String,Integer> readVocabulary(String fileName) {
// Declare the HashMap to be returned
HashMap<String, Integer> wordCount = new HashMap();
String toRead = fileName;
try {
FileReader reader = new FileReader(toRead);
BufferedReader br = new BufferedReader(reader);
// The BufferedReader reads the lines
String line = br.readLine();
// Split the line into a String array to loop through
String[] words = line.split(" ");
// for loop goes through every word
for (int i = 0; i < words.length; i++) {
// Case if the HashMap already contains the key.
// If so, just increments the value.
if (wordCount.containsKey(words[i])) {
int n = wordCount.get(words[i]);
wordCount.put(words[i], ++n);
}
// Otherwise, puts the word into the HashMap
else {
wordCount.put(words[i], 1);
}
}
br.close();
}
// Catching the file not found error
// and any other errors
catch (FileNotFoundException fnfe) {
System.err.println("File not found.");
}
catch (Exception e) {
System.err.print(e);
}
return wordCount;
}
The issue is that I'm not sure how to get the method to ignore the 2's and 1's and 29's of the .txt file. I attempted making an 'else if' statement to catch all of these cases but there are too many. Is there a way for me to catch all the ints from, say, 1-100, and exclude them from being keys in the HashMap? I've searched online but haven't turned up anything.
Thank you for any help you can give!

How about just calling wordCount.put(words[0], 1) for every line, after you've done the split? If the pattern is always "word number", you only need the first item from the split array.
Update after some back and forth
public static HashMap<String,Integer> readVocabulary(String toRead)
{
// Declare the HashMap to be returned
HashMap<String, Integer> wordCount = new HashMap<String, Integer>();
String line = null;
String[] words = null;
int lineNumber = 0;
FileReader reader = null;
BufferedReader br = null;
try {
reader = new FileReader(toRead);
br = new BufferedReader(reader);
// Split the line into a String array to loop through
while ((line = br.readLine()) != null) {
lineNumber++;
words = line.split(" ");
if (words.length == 2) {
if (wordCount.containsKey(words[0]))
{
int n = wordCount.get(words[0]);
wordCount.put(words[0], ++n);
}
// Otherwise, puts the word into the HashMap
else
{
boolean word2IsInteger = true;
try
{
Integer.parseInt(words[1]);
}
catch(NumberFormatException nfe)
{
word2IsInteger = false;
}
if (word2IsInteger) {
wordCount.put(words[0], Integer.parseInt(words[1]));
}
}
}
}
br.close();
br = null;
reader.close();
reader = null;
}
// Catching the file not found error
// and any other errors
catch (FileNotFoundException fnfe) {
System.err.println("File not found.");
}
catch (Exception e) {
System.err.print(e);
}
return wordCount;
}
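A more compact sketch of the same idea, assuming every useful line has the shape "word count". The class name VocabularySketch is hypothetical, and unlike the code above it adds the counts together when a word appears on more than one line (via Map.merge, Java 8+). It takes a BufferedReader so it can be tried on a string as well as on a file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

class VocabularySketch {
    // Parses lines of the form "word count" into a map.
    // Lines that do not have exactly two tokens, or whose second
    // token is not an integer, are skipped.
    static Map<String, Integer> readVocabulary(BufferedReader br) {
        Map<String, Integer> wordCount = new HashMap<>();
        try {
            String line;
            while ((line = br.readLine()) != null) {
                String[] parts = line.trim().split("\\s+");
                if (parts.length != 2) {
                    continue; // not "word count"
                }
                try {
                    int count = Integer.parseInt(parts[1]);
                    // Assumption: repeated words have their counts summed.
                    wordCount.merge(parts[0], count, Integer::sum);
                } catch (NumberFormatException e) {
                    // second token is not an integer; skip this line
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return wordCount;
    }

    public static void main(String[] args) {
        String sample = "more 2\nof 29\noff 1\n";
        BufferedReader br = new BufferedReader(new StringReader(sample));
        System.out.println(readVocabulary(br));
    }
}
```

For a real file, pass new BufferedReader(new FileReader(fileName)) inside a try-with-resources block so the reader is closed even when an exception is thrown.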

To check whether a String contains only digits, use String's matches() method, e.g.
if (!words[i].matches("^\\d+$")){
// NOT a String containing only digits
}
This won't require catching exceptions, and it doesn't matter if the number wouldn't fit inside an Integer.
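As a rough illustration of how that check could slot into a counting loop, here is a minimal sketch. The class and method names are made up for the example. Since matches() always tests the whole string, the ^ and $ anchors are optional and are left out here.

```java
import java.util.HashMap;
import java.util.Map;

class DigitFilterSketch {
    // Counts the tokens of one line, skipping tokens made only of digits.
    static Map<String, Integer> countWords(String line) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : line.trim().split("\\s+")) {
            if (token.isEmpty() || token.matches("\\d+")) {
                continue; // blank input or a pure number: not a word
            }
            counts.merge(token, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countWords("of 29 off 1 of"));
    }
}
```

A token such as 123456789012345678901 is skipped too, even though Integer.parseInt would reject it as out of range.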

Option 1: Ignore numbers separated by whitespace
Use Integer.parseInt() or Double.parseDouble() and catch the exception.
// for loop goes through every word
for (int i = 0; i < words.length; i++) {
try {
int wordAsInt = Integer.parseInt(words[i]);
} catch(NumberFormatException e) {
// Case if the HashMap already contains the key.
// If so, just increments the value.
if (wordCount.containsKey(words[i])) {
int n = wordCount.get(words[i]);
wordCount.put(words[i], ++n);
}
// Otherwise, puts the word into the HashMap
else {
wordCount.put(words[i], 1);
}
}
}
There is a Double.parseDouble(String) method, which you could use in place of Integer.parseInt(String) above if you wanted to eliminate all numbers, not just integers.
Option 2: Ignore numbers everywhere
Another option is to parse your input one character at a time and ignore any character that isn't a letter. When you scan whitespace, then you could add the word generated by the characters just scanned in to your HashMap. Unlike the methods mentioned above, scanning by character would allow you to ignore numbers even if they appear immediately next to other characters.
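A minimal sketch of Option 2, under the assumption that anything that is neither a letter nor whitespace should simply be dropped. The names are hypothetical. Note the side effect: punctuation disappears as well, so "morning's" is counted as "mornings" and "mythology." as "mythology".

```java
import java.util.HashMap;
import java.util.Map;

class LetterScannerSketch {
    // Scans the text one character at a time, keeping only letters.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new HashMap<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(c);
            } else if (Character.isWhitespace(c)) {
                flush(current, counts); // whitespace ends the current word
            }
            // any other character (digit, punctuation) is ignored
        }
        flush(current, counts); // the text may not end with whitespace
        return counts;
    }

    // Adds the buffered word, if any, to the map and clears the buffer.
    private static void flush(StringBuilder current, Map<String, Integer> counts) {
        if (current.length() > 0) {
            counts.merge(current.toString(), 1, Integer::sum);
            current.setLength(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(countWords("more 2\nof 29\noff2 1\n"));
    }
}
```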

Related

Java : How do I print an ascending column and next to that column the same set of integers except in descending order all in one single text file

I need some help in how to do a certain step as I can not seem to figure it out.
I was given a text file with 100 random numbers in it. I am supposed to sort them in ascending order, descending order, or both, depending on the user input. Whichever option the user enters, the sorted integers are then printed to a text file. I am having trouble printing the "both" file. Here is my code up until the both statement.
public static void print(ArrayList<Integer> output, String destination){
try {
PrintWriter print = new PrintWriter(destination);
for(int i = 0; i < output.size(); i++){
print.print(output.get(i) + " ");
}
print.close();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
public static void main(String[] args) {
Scanner input = new Scanner(System.in);
BufferedReader br = null;
ArrayList<Integer> words = new ArrayList<>();
BufferedReader reader;
String numbers;
try {
reader = new BufferedReader(new FileReader("input.txt"));
while((numbers = reader.readLine()) != null)
{
words.add(Integer.parseInt(numbers));
}
System.out.println("How would you like to sort?");
System.out.println("Please enter asc(For Ascending), desc(For Decending), or both");
String answer = input.next();
Collections.sort(words);
if(answer.equals("asc")){
Collections.sort(words);
System.out.println(words);
print(words,"asc.txt");
}
else if(answer.equals("desc")){
Collections.reverse(words);
System.out.println(words);
print(words,"desc.txt");
When I type in "both" the text file that is created only has one column set of integers that is going in descending order, not both and I have no idea how to print both sets. If someone could shed some light I would really appreciate it.
else if(answer.equals("both")){
System.out.println(words);
print(words,"both.txt");
Collections.reverse(words);
System.out.println(words);
print(words,"both.txt");
You need to use the FileOutputStream constructor that takes a boolean value telling it whether or not to append to the file.
So use like this:
PrintWriter print = new PrintWriter(new FileOutputStream(destination, true));
// the second argument (true) means: append to the file
From JavaDocs
public FileOutputStream(File file,
boolean append)
throws FileNotFoundException
Parameters:
file - the file to be opened for writing.
append - if true, then bytes will be written to the end of the file
rather than the beginning
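Appending puts the second list after the first one in the file. If the goal is the layout described in the title, with the ascending column and the descending column side by side, one possible sketch is to build both orderings first and write them row by row. The class name, method name and tab separator are assumptions made for the example; a Writer parameter is used so the method can be tried without touching the disk.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.io.Writer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class TwoColumnSketch {
    // Writes one row per number: ascending value, a tab, descending value.
    static void printBoth(List<Integer> numbers, Writer destination) {
        List<Integer> ascending = new ArrayList<>(numbers);
        Collections.sort(ascending);
        List<Integer> descending = new ArrayList<>(ascending);
        Collections.reverse(descending);

        PrintWriter out = new PrintWriter(destination);
        for (int i = 0; i < ascending.size(); i++) {
            out.println(ascending.get(i) + "\t" + descending.get(i));
        }
        out.flush(); // the caller owns the Writer and is responsible for closing it
    }

    public static void main(String[] args) {
        StringWriter sw = new StringWriter();
        printBoth(Arrays.asList(3, 1, 2), sw);
        System.out.print(sw);
    }
}
```

In the original program the call could look like printBoth(words, new FileWriter("both.txt")) inside a try-with-resources block.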

Read the each string text from file in java

I am new to Java. I just want to read each string from a file and print it on the console.
Code:
public static void main(String[] args) throws Exception {
File file = new File("/Users/OntologyFile.txt");
try {
FileInputStream fstream = new FileInputStream(file);
BufferedReader infile = new BufferedReader(new InputStreamReader(
fstream));
String data = new String();
while ((data = infile.readLine()) != null) { // use if for reading just 1 line
System.out.println(""+data);
}
} catch (IOException e) {
// Error
}
}
If file contains:
Add label abc to xyz
Add instance cdd to pqr
I want to read each word from file and print it to a new line, e.g.
Add
label
abc
...
And afterwards, I want to extract the index of a specific string, for instance get the index of abc.
Can anyone please help me?
It sounds like you want to be able to do two things:
Print all words inside the file
Search the index of a specific word
In that case, I would suggest scanning all lines, splitting by any whitespace character (space, tab, etc.) and storing the words in a collection so you can search it later on. Now the question is: can you have repeats, and in that case which index would you like to print? The first? The last? All of them?
Assuming words are unique, you can simply do:
public static void main(String[] args) throws Exception {
File file = new File("/Users/OntologyFile.txt");
ArrayList<String> words = new ArrayList<String>();
try {
FileInputStream fstream = new FileInputStream(file);
BufferedReader infile = new BufferedReader(new InputStreamReader(
fstream));
String data = null;
while ((data = infile.readLine()) != null) {
for (String word : data.split("\\s+")) {
words.add(word);
System.out.println(word);
}
}
} catch (IOException e) {
// Error
}
// search for the index of abc:
for (int i = 0; i < words.size(); i++) {
if (words.get(i).equals("abc")) {
System.out.println("abc index is " + i);
break;
}
}
}
If you don't break, it'll print every index of abc (if words are not unique). You could of course optimize it more if the set of words is very large, but for a small amount of data, this should suffice.
Of course, if you know in advance which words' indices you'd like to print, you could forego the extra data structure (the ArrayList) and simply print that as you scan the file, unless you want the printings (of words and specific indices) to be separate in output.
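For the case where words may repeat, here is a small sketch that collects every index rather than only the first. The class and method names are hypothetical, and the lines are passed in as a List so the example runs without a file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class WordIndexSketch {
    // Splits every line on whitespace and returns all words in file order.
    static List<String> splitWords(List<String> lines) {
        List<String> words = new ArrayList<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        }
        return words;
    }

    // Returns every 0-based position at which target occurs.
    static List<Integer> indicesOf(List<String> words, String target) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < words.size(); i++) {
            if (words.get(i).equals(target)) {
                indices.add(i);
            }
        }
        return indices;
    }

    public static void main(String[] args) {
        List<String> words = splitWords(Arrays.asList(
                "Add label abc to xyz", "Add instance abc to pqr"));
        System.out.println(indicesOf(words, "abc")); // prints [2, 7]
    }
}
```

When only the first occurrence matters, words.indexOf("abc") does the same job and returns -1 if the word is absent.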
Split the String received for any whitespace with the regex \\s+ and print out the resultant data with a for loop.
public static void main(String[] args) { // Don't make main throw an exception
File file = new File("/Users/OntologyFile.txt");
try {
FileInputStream fstream = new FileInputStream(file);
BufferedReader infile = new BufferedReader(new InputStreamReader(fstream));
String data;
while ((data = infile.readLine()) != null) {
String[] words = data.split("\\s+"); // Split on whitespace
for (String word : words) { // Iterate through info
System.out.println(word); // Print it
}
}
} catch (IOException e) {
// Probably best to actually have this on there
System.err.println("Error found.");
e.printStackTrace();
}
}
Just add a for-each loop before printing the output :-
while ((data = infile.readLine()) != null) { // use if for reading just 1 line
for(String temp : data.split(" "))
System.out.println(temp); // no need to concatenate the empty string.
}
This will automatically print the individual strings, obtained from each String line read from the file, in a new line.
And afterwards, I want to extract the index of a specific string, for
instance get the index of abc.
I don't know which index you are actually talking about. But if you want the position within the individual line being read, then add a temporary variable count initialised to 0.
Increment it until temp equals abc (note that count is incremented before the comparison, so the position it prints is 1-based). Like,
int count = 0;
for(String temp : data.split(" ")){
count++;
if("abc".equals(temp))
System.out.println("Index of abc is : "+count);
System.out.println(temp);
}
Use the split() method available in class String. You may adapt it to your needs.
or
use the length() method to iterate over the complete line,
and at any non-alphabetic character take the substring() and write it to a new line.
List<String> words = new ArrayList<String>();
while ((data = infile.readLine()) != null) {
for(String d : data.split(" ")) {
System.out.println(""+d);
}
words.addAll(Arrays.asList(data.split(" "))); // add the individual words, not the whole line
}
//words List will hold all the words. Do words.indexOf("abc") to get index
if(words.indexOf("abc") < 0) {
System.out.println("word not present");
} else {
System.out.println("word present at index " + words.indexOf("abc"));
}

How can I read from the next line of a text file, and pause, allowing me to read from the line after that later?

I wrote a program that generates random numbers into two text files and random letters into a third, according to the two constant files. Now I need to read from each text file, line by line, and put them together. The problem is that the suggestion found here doesn't really help my situation. When I try that approach it just reads all lines until it's done, without allowing me the option to pause it, go to a different file, etc.
Ideally I would like to find some way to read just the next line, and then later go to the line after that. Like maybe some kind of variable to hold my place in reading or something.
public static void mergeProductCodesToFile(String prefixFile,
String inlineFile,
String suffixFile,
String productFile) throws IOException
{
try (BufferedReader br = new BufferedReader(new FileReader(prefixFile)))
{
String line;
while ((line = br.readLine()) != null)
{
try (PrintWriter out = new PrintWriter(new BufferedWriter(new FileWriter(productFile, true))))
{
out.print(line); //This will print the next digit to the right
}
catch (FileNotFoundException e)
{
System.err.println("File error: " + e.getMessage());
}
}
}
}
EDIT: The digits are being created according to the following. Basically, constants tell it how many digits to create in each line and how many lines to create. Now I need to combine these together without deleting anything from either text file.
public static void writeRandomCodesToFile(String codeFile,
char fromChar, char toChar,
int numberOfCharactersPerCode,
int numberOfCodesToGenerate) throws IOException
{
for (int i = 1; i <= PRODUCT_COUNT; i++)
{
int I = 0;
if (codeFile == "inline.txt")
{
for (I = 1; I <= CHARACTERS_PER_CODE; I++)
{
int digit = (int)(Math.random() * 10);
try (PrintWriter out = new PrintWriter(new BufferedWriter(new FileWriter(codeFile, true))))
{
out.print(digit); //This will print the next digit to the right
}
catch (FileNotFoundException e)
{
System.err.println("File error: " + e.getMessage());
System.exit(1);
}
}
}
if ((codeFile == "prefix.txt") || (codeFile == "suffix.txt"))
{
for (I = 1; I <= CHARACTERS_PER_CODE; I++)
{
Random r = new Random();
char digit = (char)(r.nextInt(26) + 'a');
digit = Character.toUpperCase(digit);
try (PrintWriter out = new PrintWriter(new BufferedWriter(new FileWriter(codeFile, true))))
{
out.print(digit);
}
catch (FileNotFoundException e)
{
System.err.println("File error: " + e.getMessage());
System.exit(1);
}
}
}
//This will take the text file to the next line
if (I >= CHARACTERS_PER_CODE)
{
{
Random r = new Random();
char digit = (char)(r.nextInt(26) + 'a');
try (PrintWriter out = new PrintWriter(new BufferedWriter(new FileWriter(codeFile, true))))
{
out.println(""); //This will return a new line for the next loop
}
catch (FileNotFoundException e)
{
System.err.println("File error: " + e.getMessage());
System.exit(1);
}
}
}
}
System.out.println(codeFile + " was successfully created.");
}// end writeRandomCodesToFile()
Staying close to your code, it would be something like this:
public static void mergeProductCodesToFile(String prefixFile, String inlineFile, String suffixFile, String productFile) throws IOException {
try (BufferedReader prefixReader = new BufferedReader(new FileReader(prefixFile));
BufferedReader inlineReader = new BufferedReader(new FileReader(inlineFile));
BufferedReader suffixReader = new BufferedReader(new FileReader(suffixFile))) {
StringBuilder line = new StringBuilder();
String prefix, inline, suffix;
while ((prefix = prefixReader.readLine()) != null) {
//assuming that nothing fails and the files are equals in # of lines.
inline = inlineReader.readLine();
suffix = suffixReader.readLine();
line.append(prefix).append(inline).append(suffix).append("\r\n");
// write it
...
}
} finally {/*close writers*/}
}
Some exceptions may be thrown.
I hope you don't implement it in one single method.
You can make use of iterators too, or a very simple reader class (method).
I wouldn't use a List to load the data unless I could guarantee that the files are small and that I can spare the memory.
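To make the "pause and come back later" idea concrete: each BufferedReader remembers its own position, so calling readLine() on one reader, then on another, and returning to the first later picks up exactly where it left off. A self-contained sketch of the merge along those lines follows. The class name is hypothetical, Reader/Writer parameters are used so it can be tried on strings, and it stops at the shortest input, which is an assumption the original code does not state.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

class MergeSketch {
    // Reads one line from each source in turn and writes them joined together.
    // Returns the number of merged lines written.
    static int merge(Reader prefixSrc, Reader inlineSrc, Reader suffixSrc, Writer dest) {
        int written = 0;
        try (BufferedReader prefix = new BufferedReader(prefixSrc);
             BufferedReader inline = new BufferedReader(inlineSrc);
             BufferedReader suffix = new BufferedReader(suffixSrc)) {
            PrintWriter out = new PrintWriter(dest);
            String p, i, s;
            // Each readLine() call advances only its own reader by one line.
            while ((p = prefix.readLine()) != null
                    && (i = inline.readLine()) != null
                    && (s = suffix.readLine()) != null) {
                out.println(p + i + s);
                written++;
            }
            out.flush(); // the caller owns dest and closes it
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return written;
    }

    public static void main(String[] args) {
        StringWriter sw = new StringWriter();
        merge(new StringReader("AB\nCD\n"), new StringReader("12\n34\n"),
              new StringReader("YZ\nWX\n"), sw);
        System.out.print(sw);
    }
}
```

With real files, pass new FileReader(prefixFile) and so on, and a FileWriter for productFile.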
My approach, as we discussed, is to store the data and interleave it. As Sergio said in his answer, make sure memory isn't a problem given the size of the files and how much memory the data structures will use.
//the main method we're working on
public static void mergeProductCodesToFile(String prefixFile,
String inlineFile,
String suffixFile,
String productFile) throws IOException
{
try {
List<String> prefix = read(prefixFile);
List<String> inline = read(inlineFile);
List<String> suffix = read(suffixFile);
String fileText = interleave(prefix, inline, suffix);
//write the single string to file however you want
} catch (...) {...}//do your error handling...
}
//helper methods and some static variables
private static StringBuilder sb;
private static List<String> read(String filename) throws IOException
{
List<String> list = new ArrayList<String>();
//I just prefer scanner. Use whatever you want.
try (Scanner reader = new Scanner(new File(filename)))
{
while(reader.hasNextLine())
{ list.add(reader.nextLine()); }
} catch (...) {...}//catch errors...
return list;
}
//I'm going to build the whole file in one string, but you could also have this method return one line at a time (something like an iterator) and output it to the file to avoid creating the massive string
private static String interleave(List<String> one, List<String> two, List<String> three)
{
sb = new StringBuilder();
for (int i = 0; i < one.size(); i++)//notice no checking on size equality of words or the lists. you might want this
{
sb.append(one.get(i)).append(two.get(i)).append(three.get(i)).append("\n");
}
return sb.toString();
}
Obviously there is still some to be desired in terms of memory and performance; additionally there are ways to make this slightly more extensible to other situations, but it's a good starting point. With c#, I could more easily make use of the iterator to make interleave give you one line at a time, potentially saving memory. Just a different idea!

ArrayList confusion

The code below is my attempt to read from a file of strings, read through each line until a ':' is found, then store and print everything after that. However, the print function prints out everything that I read in from the file. Can someone spot where I'm going wrong? Thanks
edit: every line is in this format "Some text here:More text here"
public void openFile() {
try {
scanner = new BufferedReader(new FileReader("calendar.ics"));
} catch (Exception e) {
System.out.println("Could not open file");
}
}
public void readFile() {
ArrayList<String> vals = new ArrayList<String>();
String test;
try {
while ((line = scanner.readLine()) != null)
{
int indexOfComma = line.indexOf("\\:"); // returns firstIndexOf ':'
test = line.substring(indexOfComma+1); // test to be everything after ':'
vals.add(test); // add values to vals
}
} catch(Exception ex){ }
for(int i=0; i<vals.size(); i++){
System.out.println(vals.get(i));
}
}
You don't need to escape your colon.
line.indexOf("\\:");
Change the above line to: -
line.indexOf(":");
Because that will search for the two-character string \: (a backslash followed by a colon), and if it is not found, return -1.
test = line.substring(indexOfComma+1);
So, if your indexOfComma is -1, which it certainly will be if your string does not contain \:, then the above line becomes: -
line.substring(0); // same as whole string
As a suggestion, you should use the abstract type as the reference type when declaring your list. So, use List instead of ArrayList on the LHS of the declaration: -
List<String> vals = new ArrayList<String>();
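Putting the fix together, a minimal sketch of the extraction step that also handles lines without a colon explicitly instead of silently returning the whole line. The helper name and the choice of returning null for "no colon" are assumptions made for the example.

```java
class AfterColonSketch {
    // Returns everything after the first ':' in the line,
    // or null when the line contains no colon at all.
    static String afterFirstColon(String line) {
        int idx = line.indexOf(':');
        if (idx < 0) {
            return null; // indexOf returns -1 when nothing is found
        }
        return line.substring(idx + 1);
    }

    public static void main(String[] args) {
        // prints "More text here"
        System.out.println(afterFirstColon("Some text here:More text here"));
        // prints -1: the escaped form looks for a backslash followed by a colon
        System.out.println("Some text here:More text here".indexOf("\\:"));
    }
}
```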

Print data from file to array

I need to have this file print to an array, not to the screen. And yes, I MUST use an array (school project). I'm very new to Java, so any help is appreciated. Any ideas? Thanks
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Scanner;
public class HangmanProject
{
public static void main(String[] args) throws FileNotFoundException
{
String scoreKeeper; // to keep track of score
int guessesLeft; // to keep track of guesses remaining
String wordList[]; // array to store words
Scanner keyboard = new Scanner(System.in); // to read user's input
System.out.println("Welcome to Hangman Project!");
// Create a scanner to read the secret words file
Scanner wordScan = null;
try {
wordScan = new Scanner(new BufferedReader(new FileReader("words.txt")));
while (wordScan.hasNext()) {
System.out.println(wordScan.next());
}
} finally {
if (wordScan != null) {
wordScan.close();
}
}
}
}
Nick, you just gave us the final piece of the puzzle. If you know the number of lines you will be reading, you can simply define an array of that length before you read the file
Something like...
String[] wordArray = new String[10];
int index = 0;
String word = null; // word to be read from file...
// Use buffered reader to read each line...
wordArray[index] = word;
index++;
Now that example's not going to mean much to be honest, so I did these two examples
The first one uses the concept suggested by Alex, which allows you to read an unknown number of lines from the file.
The only trip-up is if the lines are separated by more than one line feed (i.e. there is an extra line between words)
public static void readUnknownWords() {
// Reference to the words file
File words = new File("Words.txt");
// Use a StringBuilder to buffer the content as it's read from the file
StringBuilder sb = new StringBuilder(128);
BufferedReader reader = null;
try {
// Create the reader. A plain FileReader would be just as fine in this
// example, but hey ;)
reader = new BufferedReader(new FileReader(words));
// The read buffer to use to read data into
char[] buffer = new char[1024];
int bytesRead = -1;
// Read the file until we get to the end
while ((bytesRead = reader.read(buffer)) != -1) {
// Append the results to the string builder
sb.append(buffer, 0, bytesRead);
}
// Split the string builder into individual words by the line break
String[] wordArray = sb.toString().split("\n");
System.out.println("Read " + wordArray.length + " words");
} catch (Exception e) {
e.printStackTrace();
} finally {
try {
reader.close();
} catch (Exception e) {
}
}
}
The second demonstrates how to read the words into an array of known length. This is probably closer to the what you actually want
public static void readKnownWords() {
// This is just the same as the previous example, except we
// know in advance the number of lines we will be reading
File words = new File("Words.txt");
BufferedReader reader = null;
try {
// Create the word array of a known quantity
// The quantity value could be defined as a constant
// ie public static final int WORD_COUNT = 10;
String[] wordArray = new String[10];
reader = new BufferedReader(new FileReader(words));
// Instead of reading to a char buffer, we are
// going to take the easy route and read each line
// straight into a String
String text = null;
// The current array index
int index = 0;
// Read the file till we reach the end
// ps- my file had lots more words, so I put a limit
// in the loop to prevent index out of bounds exceptions
while ((text = reader.readLine()) != null && index < 10) {
wordArray[index] = text;
index++;
}
// index holds the number of words actually read; wordArray.length is always 10
System.out.println("Read " + index + " words");
} catch (Exception e) {
e.printStackTrace();
} finally {
try {
reader.close();
} catch (Exception e) {
}
}
}
If you find either of these useful, I would appreciate it if you would give me a small up-vote and check Alex's answer as correct, as it's his idea that I've adapted.
Now, if you're really paranoid about which line break to use, you can find the values used by the system via the System.getProperties().getProperty("line.separator") value.
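One way to sidestep both the line-separator question and the blank-line trip-up is to split on the regex \R+, which in Java 8 and later matches a run of any kind of line break. A minimal sketch, with a hypothetical helper name:

```java
class LineSplitSketch {
    // Splits text into lines on any line break (\n, \r\n, \r, ...),
    // treating consecutive breaks as one so blank lines are dropped.
    static String[] splitLines(String text) {
        String trimmed = text.trim(); // avoid an empty first element
        if (trimmed.isEmpty()) {
            return new String[0];
        }
        return trimmed.split("\\R+");
    }

    public static void main(String[] args) {
        String[] words = splitLines("Word1\r\nWord2\n\nWord3\n");
        System.out.println("Read " + words.length + " words"); // prints "Read 3 words"
    }
}
```

Note that trim() also removes leading and trailing spaces, which is assumed to be acceptable for a word list.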
Do you need more help with the reading the file, or getting the String to a parsed array? If you can read the file into a String, simply do:
String[] words = readString.split("\n");
That will split the string at each line break, so assuming this is your text file:
Word1
Word2
Word3
words will be: {"Word1", "Word2", "Word3"}
If each word is stored on its own line of the file, you can use hasNextLine() and nextLine() to read the text one line at a time. Using next() will also work, since you only need to put one word at a time into the array, but nextLine() is usually preferred.
As for only using an array, you have two options:
You either declare a large array whose size you are sure will never be less than the total number of words;
You go through the file twice: the first time you count the elements, then you initialize the array with that size and go through the file a second time, adding each string as you go.
It is usually recommended to use a dynamic collection such as an ArrayList. You can then use the toArray() method to turn the list into an array.
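A short sketch of the ArrayList-then-toArray route, which still ends with a plain String[] as the assignment requires. The class and method names are hypothetical, and the method takes a Scanner so it works with a file or a string alike.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class WordArraySketch {
    // Collects every whitespace-separated word, then copies them into an array.
    static String[] readWords(Scanner in) {
        List<String> words = new ArrayList<>();
        while (in.hasNext()) {
            words.add(in.next());
        }
        return words.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // A Scanner over a String stands in for the words file here.
        String[] wordList = readWords(new Scanner("apple\nbanana\n\ncherry"));
        System.out.println("Read " + wordList.length + " words"); // prints "Read 3 words"
    }
}
```

For the real file, pass new Scanner(new File("words.txt")) and close the scanner afterwards.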
