Preserving line breaks and spacing in file IO

I am working on a pretty neat challenge problem that involves reading words from a .txt file. The program must allow for ANY .txt file to be read, so it cannot predict what words it will be dealing with.
Then it converts each word to its "Pig Latin" counterpart and writes the results into a new file. There are a lot more requirements to this problem, but suffice it to say I have every part solved save one: when printing to the new file I am unable to preserve the line breaks. That is to say, if line 1 has 5 words followed by a break and line 2 has 3 words followed by a break, the same must be true for the new file. As it stands now, everything works, but the converted words are all listed one after the other.
I am interested in learning this, so I am OK if you all wish to play coy in your answers. Although I have been at this for 9 hours, so "semi-coy" will be appreciated as well :) Please pay close attention to the "while" statements in the code; that is where the file IO action is happening. I am wondering if I need to use the Scanner's nextLine() method to read a whole line into a String, and then break that String into substrings to convert the words one at a time. The substrings could come from a split or from tokens, or something else. I am unclear on this part: my token attempts throw "java.util.NoSuchElementException", and I do not seem to understand the correct call for split. I tried something like String a = scan.nextLine(), where "scan" is my Scanner variable, and then String b = a.split(), but no go. Anyway, here is my code; see if you can figure out what I am missing.
Here is the code, and thank you very much in advance, Java gods...
import java.util.*;
import javax.swing.*;
import java.io.*;
import java.text.*;
public class PigLatinTranslator
{
static final String ay = "ay"; // "ay" is added to the end of every word in pig latin
public static void main(String [] args) throws IOException
{
File nonPiggedFile = new File(...);
String nonPiggedFileName = nonPiggedFile.getName();
Scanner scan = new Scanner(nonPiggedFile);
nonPiggedFileName = ...;
File pigLatinFile = new File(nonPiggedFileName + "-pigLatin.txt"); //references a file that may or may not exist yet
pigLatinFile.createNewFile();
FileWriter newPigLatinFile = new FileWriter(nonPiggedFileName + "-pigLatin.txt", true);
PrintWriter PrintToPLF = new PrintWriter(newPigLatinFile);
while (scan.hasNext())
{
boolean next;
while (next = scan.hasNext())
{
String nonPig = scan.next();
nonPig = nonPig.toLowerCase();
StringBuilder PigLatWord = new StringBuilder(nonPig);
PigLatWord.insert(nonPig.length(), nonPig.charAt(0) );
PigLatWord.insert(nonPig.length() + 1, ay);
PigLatWord.deleteCharAt(0);
String plw = PigLatWord.toString();
if (plw.contains("!") )
{
plw = plw.replace("!", "") + "!";
}
if (plw.contains(".") )
{
plw = plw.replace(".", "") + ".";
}
if (plw.contains("?") )
{
plw = plw.replace("?", "") + "?";
}
PrintToPLF.print(plw + " ");
}
PrintToPLF.close();
}
}
}

Use BufferedReader, not Scanner. http://docs.oracle.com/javase/6/docs/api/java/io/BufferedReader.html
I leave that part of it as an exercise for the original poster, it's easy once you know the right class to use! (And hopefully you learn something instead of copy-pasting my code).
Then pass each entire line into functions like these. Note that this does not handle quotes correctly, because it puts all non-apostrophe punctuation at the end of the word; it also assumes that the end of the word is where punctuation is supposed to go.
private static final String vowels = "AEIOUaeiou";
private static final String punct = ".,!?";
public static String pigifyLine(String oneLine) {
StringBuilder pigified = new StringBuilder();
boolean first = true;
for (String word : oneLine.split(" ")) {
if (!first) pigified.append(" ");
pigified.append(pigify(word));
first = false;
}
return pigified.toString();
}
public static String pigify(String oneWord) {
char[] chars = oneWord.toCharArray();
StringBuilder consonants = new StringBuilder();
StringBuilder newWord = new StringBuilder();
StringBuilder punctuation = new StringBuilder();
boolean consDone = false; // set to true when the first consonant group is done
for (int i = 0; i < chars.length; i++) {
// consonant
if (vowels.indexOf(chars[i]) == -1) {
// punctuation
if (punct.indexOf(chars[i]) > -1) {
punctuation.append(chars[i]);
consDone = true;
} else {
if (!consDone) { // we haven't found the consonants
consonants.append(chars[i]);
} else {
newWord.append(chars[i]);
}
}
} else {
consDone = true;
// vowel
newWord.append(chars[i]);
}
}
if (consonants.length() == 0) {
// vowel words are "about" -> "aboutway"
consonants.append("w");
}
consonants.append("ay");
return newWord.append(consonants).append(punctuation).toString();
}
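The reading and writing loop that the answer above leaves as an exercise could look roughly like the sketch below. It is only an illustration under some assumptions: the class name LinePreservingCopy and the translate method are made up for this example, the files are assumed to be UTF-8, and String::toUpperCase stands in for the real per-line conversion (a method such as pigifyLine above would be passed instead). The point is that each line read with readLine() is written back with println(), so the output keeps the same line breaks, including blank lines.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.UnaryOperator;

public class LinePreservingCopy {
    // Reads 'in' one line at a time, transforms each line, and writes it to 'out'
    // followed by a line break, so the output has as many lines as the input.
    static void translate(Path in, Path out, UnaryOperator<String> lineTransform) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.UTF_8);
             PrintWriter writer = new PrintWriter(Files.newBufferedWriter(out, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // println (not print) restores the line break that readLine() stripped
                writer.println(lineTransform.apply(line));
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical usage: java LinePreservingCopy input.txt output.txt
        // String::toUpperCase is a placeholder for the real line converter.
        translate(Paths.get(args[0]), Paths.get(args[1]), String::toUpperCase);
    }
}
```

Because the conversion is passed in as a function, the line handling can be checked on its own before the Pig Latin logic is plugged in.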

You could try to store the count of words per line in a separate data structure, and use that as a guide for when to move on to the next line when writing the file.
I purposely made this semi-vague for you, but can elaborate on request.

Related

How do newlines affect System.in.read() in java

I'm trying to write a lexical analyzer class that tokenizes the characters of the input stream, and I use System.in.read() to read characters. The documentation says that it returns -1 when the end of the stream is reached, but I cannot understand why this behaviour differs between inputs. For example, delete.txt has the input:
1. I have
2. bulldoz//er
Then the Lexer has correct tokenization as:
[I=257, have=257, false=259, er=257, bulldoz=257, true=258]
But if I insert some blank lines using Enter, the code goes into an infinite loop. The code checks for newlines and spaces in the input, so how does it get bypassed? The input is now:
1. I have
2. bulldoz//er
3.
The full code is:
package lexer;
import java.io.*;
import java.util.*;
import lexer.Token;
import lexer.Num;
import lexer.Tag;
import lexer.Word;
class Lexer{
public int line = 1;
private char null_init = ' ';
private char tab = '\t';
private char newline = '\n';
private char peek = null_init;
private char comment1 = '/';
private char comment2 = '*';
private Hashtable<String, Word> words = new Hashtable<>();
//no-args constructor
public Lexer(){
reserve(new Word(Tag.TRUE, "true"));
reserve(new Word(Tag.FALSE, "false"));
}
void reserve(Word word_obj){
words.put(word_obj.lexeme, word_obj);
}
char read_buf_char() throws IOException {
char x = (char)System.in.read();
return x;
}
/*tokenization done here*/
public Token scan()throws IOException{
for(; ; ){
// while exiting the loop, sometime the comment
// characters are read e.g. in bulldoz//er,
// which is lost if the buffer is read;
// so read the buffer i
peek = read_buf_char();
if(peek == null_init||peek == tab){
peek = read_buf_char();
System.out.println("space is read");
}else if(peek==newline){
peek = read_buf_char();
line +=1;
}
else{
break;
}
}
if(Character.isDigit(peek)){
int v = 0;
do{
v = 10*v+Character.digit(peek, 10);
peek = read_buf_char();
}while(Character.isDigit(peek));
return new Num(v);
}
if(Character.isLetter(peek)){
StringBuffer b = new StringBuffer(32);
do{
b.append(peek);
peek = read_buf_char();
}while(Character.isLetterOrDigit(peek));
String buffer_string = b.toString();
Word reserved_word = (Word)words.get(buffer_string);//returns null if not found
if(reserved_word != null){
return reserved_word;
}
reserved_word = new Word(Tag.ID, buffer_string);
// put key value pair in words hashtble
words.put(buffer_string, reserved_word);
return reserved_word;
}
// if character read is not a digit or a letter,
// then the character read is a new token
Token t = new Token(peek);
peek = ' ';
return t;
}
private char get_peek(){
return (char)this.peek;
}
private boolean reached_buf_end(){
// reached end of buffer
if(this.get_peek() == (char)-1){
return true;
}
return false;
}
public void run_test()throws IOException{
//loop checking variable
//a token object is initialized with dummy value
Token new_token = null;
// while end of stream has not been reached
while(this.get_peek() != (char)-1){
new_token = this.scan();
}
System.out.println(words.entrySet());
}
public static void main(String[] args)throws IOException{
Lexer tokenize = new Lexer();
tokenize.run_test();
}
}
The get_peek function returns the value of peek, which holds the current input buffer character.
The check for whether the end of the buffer has been reached is done in the run_test function.
The main processing is done in the scan() function.
I used the following command to provide the file as input to the compiled Java class: cat delete.txt|java lexer/Lexer. Why does this code go into an infinite loop when the input file has the extra newline added?

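I am not sure how you are checking for the end of stream (-1). At the end of scan() you assign a space to "peek", and I think this is what goes wrong when you have a blank line: if the -1 is the character read in that call, it is overwritten with the space before run_test() gets a chance to see it, so you are not able to catch the -1.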
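One way to make the end-of-stream check harder to lose is to keep the result of read() as an int and test it against -1 before casting to char or storing it anywhere else. The sketch below is not the lexer from the question; the class EofAwareReader and the method countNewlines are hypothetical names, and a ByteArrayInputStream stands in for System.in so the example is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class EofAwareReader {
    // Counts '\n' characters in the stream and stops as soon as read() returns -1.
    static int countNewlines(InputStream in) throws IOException {
        int lines = 0;
        int c; // keep the value as an int so -1 stays distinguishable from real data
        while ((c = in.read()) != -1) {
            if ((char) c == '\n') {
                lines++;
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Same shape as the failing input: two lines of text plus a trailing blank line.
        byte[] data = "I have\nbulldoz//er\n\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(countNewlines(new ByteArrayInputStream(data)));
    }
}
```

In the lexer, the same idea would mean recording that the end of the stream was seen (for example in a boolean field) at the point where read() returns -1, instead of relying on the value of peek after scan() has reset it.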

Is there any way to let the program recognize "\n" in text files as line break code?

I've been creating a game in Java for a while and I used to write all the in-game texts directly in my code like this:
String text001 = "You're in the castle.\n\nWhere do you go next?"
But recently I decided to write all the in-game texts in a text file and tried to let the program read them and put them into a String array since the amount of the texts has increased a lot and it made my code incredibly long. The reading went well except one thing. I've inserted line break codes in dialogues and although the code worked properly when I wrote it directly in my code, they are no longer recognized as line break code when I try to read them from a text file.
It is supposed to be displayed as:
You're in the castle.
Where do you go next?
But now it is displayed as:
You're in the castle.\n\nWhere do you go next?
The code doesn't recognize "\n" as line break code any more.
Here's the code :
import java.io.File;
import java.util.Scanner;
import java.util.StringTokenizer;
public class Main {
public static void main(String[] args) {
new Main();
}
public Main() {
Scanner sc;
StringTokenizer token;
String line;
int lineNumber = 1;
String id[] = new String[100];
String text[] = new String[100];
try {
sc = new Scanner(new File("sample.txt"));
while ((line = sc.nextLine()) != null) {
token = new StringTokenizer(line, "|");
while (token.hasMoreTokens()) {
id[lineNumber] = token.nextToken();
text[lineNumber] = token.nextToken();
lineNumber++;
}
}
} catch (Exception e) {
}
System.out.println(text[1]);
String text001 = "You're in the castle.\n\nWhere do you go next?";
System.out.println(text001);
}
}
And this is the content of the text file:
castle|You're in the castle.\n\nWhere do you go next?
inn|You're in the inn. \n\nWhere do you go next?
I would be grateful if anyone tells me how to fix this. Thank you.

Just use
text[lineNumber] = token.nextToken().replace("\\n", "\n");
There is nothing inherently special about \n in a text file. It is just a \ followed by an n.
It is only Java (and other languages) that define that this sequence of characters, in a char or string literal, should be interpreted as a 0x0a (ASCII newline) character.
So you can replace the two-character sequence with the character you want it to be interpreted as.
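As a small self-contained illustration of that replacement (the class name EscapeDemo and the method unescapeNewlines are made up for this sketch): in Java source, "\\n" is a backslash followed by the letter n, which is what the file contains, while "\n" is the single newline character.

```java
public class EscapeDemo {
    // Turns the two-character sequence backslash + 'n' into one real newline character.
    static String unescapeNewlines(String raw) {
        return raw.replace("\\n", "\n");
    }

    public static void main(String[] args) {
        // This literal mimics the text as it arrives from the file: backslash, then 'n'.
        String fromFile = "You're in the castle.\\n\\nWhere do you go next?";
        System.out.println(unescapeNewlines(fromFile));
    }
}
```

Note that String.replace works on literal text, so no regular-expression escaping is involved here; replaceAll would need the backslash escaped a second time.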

Java compare strings from two places and exclude any matches

I'm trying to end up with a results.txt minus any matching items, having successfully compared some string inputs against another .txt file. Been staring at this code for way too long and I can't figure out why it isn't working. New to coding so would appreciate it if I could be steered in the right direction! Maybe I need a different approach? Apologies in advance for any loud tutting noises you may make. Using Java8.
//Sending a String[] into 'searchFile', contains around 8 small strings.
//Example of input: String[]{"name1","name2","name 3", "name 4.zip"}
^ This is my exclusions list.
public static void searchFile(String[] arr, String separator)
{
StringBuilder b = new StringBuilder();
for(int i = 0; i < arr.length; i++)
{
if(i != 0) b.append(separator);
b.append(arr[i]);
String findME = arr[i];
searchInfo(MyApp.getOptionsDir()+File.separator+"file-to-search.txt",findME);
}
}
^This works fine. I'm then sending the results to 'searchInfo' and trying to match and remove any duplicate (complete, not part) strings. This is where I am currently failing. Code runs but doesn't produce my desired output. It often finds part strings rather than complete ones. I think the 'results.txt' file is being overwritten each time...but I'm not sure tbh!
file-to-search.txt contains: "name2","name.zip","name 3.zip","name 4.zip" (text file is just a single line)
public static String searchInfo(String fileName, String findME)
{
StringBuffer sb = new StringBuffer();
try {
BufferedReader br = new BufferedReader(new FileReader(fileName));
String line = null;
while((line = br.readLine()) != null)
{
if(line.startsWith("\""+findME+"\""))
{
sb.append(line);
//tried various replace options with no joy
line = line.replaceFirst(findME+"?,", "");
//then goes off with results to create a txt file
FileHandling.createFile("results.txt",line);
}
}
} catch (Exception e) {
e.printStackTrace();
}
return sb.toString();
}
What I'm trying to end up with is a result file MINUS any matching complete strings (not part strings):
e.g. results.txt to end up with: "name.zip","name 3.zip"

OK, with the information I have, what you can do is this:
List<String> result = new ArrayList<>();
String content = FileUtils.readFileToString(file, "UTF-8");
for (String s : content.split(", ")) {
if (!s.equals(findME)) { // assuming both have string quotes added already
result.add(s);
}
}
FileUtils.write(newFile, String.join(", ", result), "UTF-8");
This uses FileUtils from Apache Commons IO for ease. You may add or remove the space after the comma in split() and join() to match your file.
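For readers who would rather not add a library, the filtering step can be sketched with the standard library alone. This is an illustration, not the asker's code: the class ExclusionFilter and the method removeExclusions are hypothetical, and the sketch assumes that the names themselves never contain a comma or a double quote. Whole names are compared, so "name 3" does not remove "name 3.zip".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ExclusionFilter {
    // Removes every entry of a line like "a","b","c" whose unquoted text exactly
    // equals one of the exclusions; partial matches are kept.
    static String removeExclusions(String line, String[] exclusions) {
        Set<String> excluded = new HashSet<>(Arrays.asList(exclusions));
        List<String> kept = new ArrayList<>();
        for (String entry : line.split(",")) {
            String name = entry.trim().replace("\"", ""); // strip the surrounding quotes
            if (!excluded.contains(name)) {
                kept.add(entry.trim()); // keep the entry with its original quotes
            }
        }
        return String.join(",", kept);
    }

    public static void main(String[] args) {
        String line = "\"name2\",\"name.zip\",\"name 3.zip\",\"name 4.zip\"";
        String[] exclusions = {"name1", "name2", "name 3", "name 4.zip"};
        System.out.println(removeExclusions(line, exclusions));
    }
}
```

Filtering the whole line once against the full exclusion list, and writing the result file once afterwards, also avoids rewriting results.txt for every single exclusion.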

Java not detecting file contents

I'm having difficulty figuring out why this isn't working. Java simply stops executing the while loop; the file apparently does not have a next line.
fileName = getFileName(keyboard);
file = new Scanner (new File (fileName));
pass = true;
String currentLine;
while (file.hasNextLine()) {
currentLine = file.nextLine();
System.out.println(reverse(currentLine));
}
Here is the file I am testing this with. I got it to work with the first few paragraphs but it seems to simply stop working...:
Jabberwocky
'Twas brillig, and the slithy toves
Did gyre and gimble in the wabe;
All mimsy were the borogoves,
And the mome raths outgrabe.
"Beware the Jabberwock, my son!
The jaws that bite, the claws that catch!
Beware the Jubjub bird, and shun
The frumious Bandersnatch!"
He took his vorpal sword in hand:
Long time the manxome foe he soughtó
So rested he by the Tumtum tree,
And stood awhile in thought.
And as in uffish thought he stood,
The Jabberwock, with eyes of flame,
Came whiffling through the tulgey wood,
And burbled as it came!
One, two! One, two! and through and through
The vorpal blade went snicker-snack!
He left it dead, and with its head
He went galumphing back.
"And hast thou slain the Jabberwock?
Come to my arms, my beamish boy!
O frabjous day! Callooh! Callay!"
He chortled in his joy.
'Twas brillig, and the slithy toves
Did gyre and gimble in the wabe;
All mimsy were the borogoves,
And the mome raths outgrabe.
——from Through the Looking-Glass, and What Alice Found There (1872).
/*
* Lab13a.java
*
* A program that prompts the user for an input file name and, if that file exists,
* displays each line of that file in reverse order.
* Used to practice simple File I/O and breaking code up into methods as well as a first
* step to implementing Lab13b.java - reversing the entire file and Lab13c.java writing
* output to a separate output file.
*
* #author Benjamin Meyer
*
*/
package osu.cse1223;
import java.io.*;
import java.util.*;
public class Lab13a {
public static void main(String[] args) {
Scanner keyboard = new Scanner(System.in);
String fileName = "";
Scanner file;
boolean pass = false;
while (!pass) {
try {
fileName = getFileName(keyboard);
file = new Scanner (new File (fileName));
pass = true;
String currentLine;
while (file.hasNextLine()) {
currentLine = file.nextLine();
System.out.println(reverse(currentLine));
}
}
catch (FileNotFoundException e) {
System.out.println("There was a problem reading from " + fileName);
System.out.println("Goodbye.");
return;
}
}
}
// Given a Scanner as input prompts the user to enter a file name. If given an
// empty line, respond with an error message until the user enters a non-empty line.
// Return the string to the calling program. Note that this method should NOT try
// to determine whether the file name is an actual file - it should just get a
// valid string from the user.
private static String getFileName(Scanner inScanner) {
boolean pass = true;
String fileName = "";
while (pass) {
System.out.print("Enter an input name: ");
fileName = inScanner.nextLine();
if (fileName.length()!=0) {
pass = false;
}
else {
System.out.println("You cannot enter an empty string.");
}
}
return fileName;
}
// Given a String as input return the reverse of that String to the calling program.
private static String reverse(String inString) {
if (inString.length()==0) {
return "";
}
String reversed = "" + inString.charAt(inString.length()-1);
for (int x = inString.length()-2; x>=0; x--) {
reversed = reversed + inString.charAt(x);
}
return reversed;
}
}

The issue might lie in your implementation of your functions getFilename() or reverse(). Since you have stated that you got it to work with a few of the paragraphs I doubt that your program is failing due to your file handling. It might be in the logic you are using to reverse the strings in the file that is causing the issue.
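Another possibility that may be worth ruling out, and this is only a guess based on the stray "ó" after "sought" in the sample text, is a character-encoding mismatch. A Scanner does not throw when the underlying read fails; it records the exception, and hasNextLine() simply starts returning false. If the file contains a byte that is invalid in the charset used for decoding, reading can therefore stop silently partway through. The sketch below (the class ScannerCharsetCheck and the method countLines are hypothetical names) counts the lines readable under a given charset and prints whatever Scanner.ioException() reports.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Scanner;

public class ScannerCharsetCheck {
    // Counts the lines a Scanner can read from 'file' with the given charset name
    // and reports any I/O error the Scanner recorded instead of throwing.
    static int countLines(File file, String charsetName) throws FileNotFoundException {
        int count = 0;
        try (Scanner in = new Scanner(file, charsetName)) {
            while (in.hasNextLine()) {
                in.nextLine();
                count++;
            }
            IOException recorded = in.ioException(); // null if reading ended normally
            if (recorded != null) {
                System.out.println("Scanner stopped early (" + charsetName + "): " + recorded);
            }
        }
        return count;
    }

    public static void main(String[] args) throws FileNotFoundException {
        File file = new File(args[0]); // hypothetical: path to the poem file
        System.out.println("UTF-8:      " + countLines(file, "UTF-8"));
        System.out.println("ISO-8859-1: " + countLines(file, "ISO-8859-1"));
    }
}
```

If the two counts differ, the file is probably not in the encoding the program assumes, and passing the right charset name to the Scanner constructor (or re-saving the file) would be the thing to try. ISO-8859-1 is used here only because it accepts every byte value; it may not be the file's real encoding.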

How to find certain words in a text file, then find numbers in Java?

I have the following text file (answers.txt):
Problem A: 23|47|32|20
Problem B: 40|50|30|45
Problem C: 5|8|11|14
Problem D: 20|23|25|30
What I need is something that will read the problem that I tell it(Problem A, Problem B), then read the numbers after it, which are separated by the lines, and print it out like this:
Answers for Problem A: a.23 b.47 c.32 d.20
Does anyone know how this can be done? I've been stuck on it for a while.

Read the lines one by one, and split each line at " " first. Then you will get an array with three parts: "Problem", "A:" and "23|47|32|20". Then split the third part at "|" so you get a second array with four parts: "23", "47", "32", "20".
Combine these to get the output you want.
If you want info on how to read lines from a file or split strings, there are countless tutorials online, so I won't go into detail on how it's done. I'm sure you can find them.
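The two splits described above can be sketched as follows. The class AnswerLine and the method formatAnswers are names invented for this example, and the sketch assumes every line has exactly the "Problem X: n|n|n|n" shape from the question. One detail that is easy to miss: String.split takes a regular expression, and "|" is a regex metacharacter, so it has to be escaped.

```java
public class AnswerLine {
    // Formats a line such as "Problem A: 23|47|32|20" as
    // "Answers for Problem A: a.23 b.47 c.32 d.20".
    static String formatAnswers(String line) {
        String[] parts = line.split(" ");         // "Problem", "A:", "23|47|32|20"
        String[] numbers = parts[2].split("\\|"); // escape '|' because split() takes a regex
        StringBuilder sb = new StringBuilder("Answers for " + parts[0] + " " + parts[1]);
        char label = 'a';
        for (String number : numbers) {
            sb.append(' ').append(label++).append('.').append(number);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(formatAnswers("Problem A: 23|47|32|20"));
    }
}
```

To pick the problem the user asks for, the caller would read the file line by line and call this only on the line that starts with the requested name.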

Check out this code!
It assumes that your file has this format:
Problem A:
23|7|32|20
Problem B:
40|50|30|45
Problem C:
5|8|11|14
Problem D:
20|23|25|30
because you wrote "numbers after it, which are separated by the lines"
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Scanner;
public class Demo {
public static void main(String[] args) throws FileNotFoundException {
Scanner sc = new Scanner(new File("answers.txt"));
List<String> dataList = new ArrayList<String>();
while(sc.hasNextLine()){
dataList.add(sc.nextLine());
}
System.out.println(dataList);
Map<String,String> map = new HashMap<String,String>();
for(int i=0;i<dataList.size();i=i+2){
map.put(dataList.get(i),dataList.get(i+1));
}
for(Entry<String,String> en:map.entrySet()){
System.out.println(en.getKey()+" : "+en.getValue());
}
String problemC = map.get("Problem C:");
String splitted[] = problemC.split("\\|");
System.out.println("Get me problem C: "+String.format("a:%s, b:%s, c:%s, d:%s",splitted[0],splitted[1],splitted[2],splitted[3]));
}
}

Hope this helps!
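import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;
public class AnswerFinder {
public static void main(String args[]) throws IOException
{
BufferedReader br = new BufferedReader(new FileReader(new File("answers.txt")));
String lineRead = null;
String problem = "Problem A";//Get this from user Input
List<String> numberData = new ArrayList<String>();
while((lineRead = br.readLine())!=null)
{
if(lineRead.contains(problem))
{
// Split "Problem A: 23|47|32|20" at the colon first
StringTokenizer st = new StringTokenizer(lineRead,":");
String problemPart = st.nextToken();
String numbersPart = st.nextToken().trim();
// Tokenize only the numbers part, not the whole line
st = new StringTokenizer(numbersPart,"|");
while(st.hasMoreTokens())
{
String number = st.nextToken();
System.out.println("Number is: " + number);
numberData.add(number);
}
break;
}
}
br.close();
System.out.println("Answers for " + problem + " : " + numberData );
}
}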

Read the lines one by one and split each line at ":". Then you will get an array with two parts, "Problem A" and " 23|47|32|20". Then split the second part at "|" so you get a second array with four parts: "23", "47", "32", "20" (trim the leading space first).
Combining all this, you will get the output you want.
Cheers!

Use java.util.Scanner and you can filter the integers in the file.
// Treat both whitespace and '|' as delimiters so that each number is its own token
Scanner s = new Scanner (new File ("answers.txt")).useDelimiter("[\\s|]+");
while (s.hasNext()) {
if (s.hasNextInt()) { // check if next token is integer
System.out.print(s.nextInt() + " ");
} else {
s.next(); // else skip the non-integer token
}
}
s.close();

Do you know how to read a file line by line? If not, check "How to read a large text file line by line in Java?".
There are many ways to take substrings of your data, and you can cut it up however you wish. Here is my code:
String data = yourReader.readLine();
String problem = data.substring("Problem".length(), data.indexOf(":"));
System.err.println("Problem is " + problem);
data = data.substring(data.indexOf(":") + 2, data.length());
String[] temp = data.split("\\|");
for (String result : temp) {
System.out.println(result);
}

Assuming there are always four possible answers as in your Example:
// read complete file in fileAsString
String regex = "^(Problem \\w+): (\\d+)\\|(\\d+)\\|(\\d+)\\|(\\d+)$";
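Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE); // MULTILINE makes ^ and $ match at every line, not only at the ends of the whole file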
Matcher matcher = pattern.matcher(fileAsString);
//and so on, read all the Problems using matcher.find() and matcher.group(int) to get the parts
// put in a Map maybe?
// output the one you want...
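Filled in, that outline might look like the sketch below. It is one possible completion, not the answerer's own code: the class ProblemRegex and the method parse are hypothetical, and the pattern is compiled with Pattern.MULTILINE so that ^ and $ match at each line of the file contents rather than only at the very start and end.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ProblemRegex {
    // One problem per line: a name, a colon, and four numbers separated by '|'.
    private static final Pattern LINE = Pattern.compile(
            "^(Problem \\w+): (\\d+)\\|(\\d+)\\|(\\d+)\\|(\\d+)$", Pattern.MULTILINE);

    // Maps each problem name to its four answers, in file order.
    static Map<String, String[]> parse(String fileAsString) {
        Map<String, String[]> answers = new LinkedHashMap<>();
        Matcher m = LINE.matcher(fileAsString);
        while (m.find()) {
            answers.put(m.group(1),
                    new String[] {m.group(2), m.group(3), m.group(4), m.group(5)});
        }
        return answers;
    }

    public static void main(String[] args) {
        String text = "Problem A: 23|47|32|20\nProblem B: 40|50|30|45\n";
        String[] a = parse(text).get("Problem A");
        System.out.println("Answers for Problem A: a." + a[0] + " b." + a[1]
                + " c." + a[2] + " d." + a[3]);
    }
}
```

Lines that do not fit the pattern are skipped silently, which may or may not be what is wanted for a hand-edited answers file.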

I might suggest creating a simple data type for the purpose of organization:
public class ProblemAnswer {
private final String problem;
private final String[] answers;
public ProblemAnswer(String problem, String[] answers) {
this.problem = problem;
this.answers = new String[answers.length];
for (int i = 0; i < answers.length; i++) {
this.answers[i] = answers[i];
}
}
public String getProblem() {
return this.problem;
}
public String[] getAnswers() {
return this.answers;
}
public String getA() {
return this.answers[0];
}
public String getB() {
return this.answers[1];
}
public String getC() {
return this.answers[2];
}
public String getD() {
return this.answers[3];
}
}
Then the reading from the text file would look something like this:
public void read() throws FileNotFoundException { //needs java.io.File and java.io.FileNotFoundException
Scanner s = new Scanner(new File("answers.txt")); //new Scanner("answers.txt") would scan the literal string
ArrayList<String> lines = new ArrayList<String>();
while (s.hasNextLine()) {
lines.add(s.nextLine());//first separate by line
}
s.close();
ProblemAnswer[] answerKey = new ProblemAnswer[lines.size()];
for (int i = 0; i < lines.size(); i++) {
String[] divide = lines.get(i).split(": "); //0 is the problem name, 1 is the list
//of answers
String[] answers = divide[1].split("\\|"); //an array of the answers to a given
//question; '|' must be escaped because split() takes a regex
answerKey[i] = new ProblemAnswer(divide[0], answers); //add a new ProblemAnswer
//object to the key
}
}
That leaves you with an answer key of ProblemAnswer objects, which is easily searched with a simple .equals() comparison on the result of getProblem(); whichever index matches, you have all the answers neatly arranged within that same object.
