Empty space when reading a file in Java

private Scanner inputFile;
private String corpusFileString = "";
try
{
File file = new File(sourceFile.getText());
inputFile = new Scanner(file);
JOptionPane.showMessageDialog(null, "The file was found.");
if (text != null)
{
translate.setEnabled(true);
}
}
catch (FileNotFoundException ex)
{
JOptionPane.showMessageDialog(null, "The file was not found.");
}
try
{
numberWords = inputFile.nextInt();
}
catch (InputMismatchException ex)
{
JOptionPane.showMessageDialog(null, "The first line on the file must be an integer");
}
while (inputFile.hasNext())
{
corpusFileString = corpusFileString + inputFile.nextLine() + " ";
}
So when I read this file, the first line should be an integer (a different variable will hold that), or it will throw an exception.
The rest of the file should be data (another variable holds all the data), but for some reason the String contains an empty space at the beginning, and when I split it I have to use +1 in my array because of that empty space.

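The issue is that nextInt() reads only the integer, so the first nextLine() call returns the rest of that first line.
Basically:
15\n a line here \n another line here
Where \n is a newline.
It reads 15, then the first nextLine() reads up to \n, which yields "" (the newline character itself is dropped). The loop appends that empty string plus " ", which is the leading space you see. The rest is read as you expected.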
Try using:
numberWords = Integer.parseInt(inputFile.nextLine());
Instead of
numberWords = inputFile.nextInt();
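A minimal sketch of the difference, reading from an in-memory string instead of a file so that it is self-contained. The class and method names (LeftoverLineDemo, readWithNextInt, readWithNextLine) are made up for illustration and are not part of the original code.

```java
import java.util.Scanner;

class LeftoverLineDemo {
    // Original approach: nextInt() stops right after the digits, so the first
    // nextLine() returns the (empty) remainder of the integer's line.
    static String readWithNextInt(String input) {
        try (Scanner in = new Scanner(input)) {
            int numberWords = in.nextInt();
            return joinRemainingLines(in);
        }
    }

    // Suggested fix: consume the whole first line, then parse it.
    static String readWithNextLine(String input) {
        try (Scanner in = new Scanner(input)) {
            int numberWords = Integer.parseInt(in.nextLine().trim());
            return joinRemainingLines(in);
        }
    }

    // Same accumulation as in the question: each line followed by one space.
    static String joinRemainingLines(Scanner in) {
        StringBuilder corpus = new StringBuilder();
        while (in.hasNextLine()) {
            corpus.append(in.nextLine()).append(' ');
        }
        return corpus.toString();
    }

    public static void main(String[] args) {
        String input = "15\na line here\nanother line here";
        // Brackets make a leading space visible in the output.
        System.out.println("[" + readWithNextInt(input) + "]");
        System.out.println("[" + readWithNextLine(input) + "]");
    }
}
```

With nextInt() the result starts with a space; with nextLine() plus Integer.parseInt() it does not.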

Use String.trim() to remove whitespace at the beginning and end of the string.
Link to the method (this is the .NET documentation for String.Trim(); the Java counterpart is String.trim()): http://msdn.microsoft.com/de-de/library/t97s7bs3(v=vs.110).aspx

I am not sure about Java. Possibly you have to call hasNext() before you try to read with nextInt(). This is how .NET readers and enumerators work. In C# I would write something like
while (reader.MoveNext()) {
string s = reader.Current;
}
In your case you could try
if (inputFile.hasNext()) {
numberWords = inputFile.nextInt();
}

Related

I read file then write the file and the spaces between text disappear

I'm reading from a temp file and writing it to a permanent file, but somewhere the string loses all its spaces.
private void jButton4ActionPerformed(java.awt.event.ActionEvent evt) {
String b, filename;
b = null;
filename = (textfieldb.getText());
try {
// TODO add your handling code here:
dispose();
Scanner scan;
scan = new Scanner(new File("TempSave.txt"));
StringBuilder sb = new StringBuilder();
while (scan.hasNext()) {
sb.append(scan.next());
}
b = sb.toString();
String c;
c = b;
FileWriter fw = null;
try {
fw = new FileWriter(filename + ".txt");
} catch (IOException ex) {
Logger.getLogger(hiudsjh.class.getName()).log(Level.SEVERE, null, ex);
}
PrintWriter pw = new PrintWriter(fw);
pw.print(c);
pw.close();
System.out.println(c);
} catch (FileNotFoundException ex) {
Logger.getLogger(NewJFrame.class.getName()).log(Level.SEVERE, null, ex);
}
dispose();
hiudsjh x = new hiudsjh();
x.setVisible(true);
System.out.println(b);
}
There are no error messages; the output should just be a file with the spaces remaining.
Instead of hasNext() and next(), with which you don't get the spaces, use hasNextLine() and nextLine() to read the file line by line, and append a line separator after each line:
while (scan.hasNextLine()) {
sb.append(scan.nextLine());
sb.append(System.lineSeparator());
}
From the Scanner documentation:
A Scanner breaks its input into tokens using a delimiter pattern, which by default matches whitespace.
And from the documentation of the next() method:
Finds and returns the next complete token from this scanner. A complete token is preceded and followed by input that matches the delimiter pattern.
In other words, the Scanner splits the input into sequences without whitespace. To read the entire file as a String, you could use new String(Files.readAllBytes(Paths.get(filePath)), StandardCharsets.UTF_8);.
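To make the token behaviour concrete, here is a small sketch that runs both loops over the same in-memory text. The class and method names (TokensVersusLines, joinTokens, joinLines) are hypothetical and only serve the illustration.

```java
import java.util.Scanner;

class TokensVersusLines {
    // next() returns tokens only; the whitespace between them is the
    // delimiter and is discarded.
    static String joinTokens(String input) {
        StringBuilder sb = new StringBuilder();
        try (Scanner scan = new Scanner(input)) {
            while (scan.hasNext()) {
                sb.append(scan.next());
            }
        }
        return sb.toString();
    }

    // nextLine() keeps the spaces inside each line; the line break itself
    // is dropped, so a separator is appended again.
    static String joinLines(String input) {
        StringBuilder sb = new StringBuilder();
        try (Scanner scan = new Scanner(input)) {
            while (scan.hasNextLine()) {
                sb.append(scan.nextLine());
                sb.append(System.lineSeparator());
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "hello big world\nsecond line";
        System.out.println(joinTokens(input));
        System.out.print(joinLines(input));
    }
}
```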
This:
while (scan.hasNext()) {
sb.append(scan.next());
}
is what is removing the spaces. next() returns the next complete token from the scanner, and a token does not include spaces. You will need to append the spaces yourself or change the way you read the file.
Instead of scanning each token, you can read your file line by line and append a line separator after each line:
while (scan.hasNextLine()) {
sb.append(scan.nextLine());
sb.append(System.lineSeparator());
}

How to skip reading a line with scanner

I have read in a text file and am scanning it. How would I skip over lines that start with certain characters (in my case, lines that start with "//" or with " " (whitespace))?
Here is my code at the moment. Can someone point me in the right direction?
File dataFile = new File(filename);
Scanner scanner = new Scanner(dataFile);
while(scanner.hasNext())
{
String lineOfText = scanner.nextLine();
if (lineOfText.startsWith("//")) {
System.out.println(); // not sure what to put here
}
System.out.println(lineOfText);
}
scanner.close();
You will only want to execute the code within the while loop if the line of text doesn't start with // or whitespace. You can filter these out as seen below:
while(scanner.hasNext()) {
String lineOfText = scanner.nextLine();
if (lineOfText.startsWith("//") || lineOfText.startsWith(" ")) {
continue; // Skip this iteration if the line starts with a space or //
}
System.out.println(lineOfText);
}
As you are iterating over the lines of text in the file, use String's startsWith() method to check if the line starts with the sequences you are trying to avoid.
If it does, continue to the next line. Otherwise, print it.
while (scanner.hasNext()) {
String lineOfText = scanner.nextLine();
if (lineOfText.startsWith("//") || lineOfText.startsWith(" ") ) {
continue;
}
System.out.println(lineOfText);
}
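The same filter can be sketched as a self-contained program that reads from a string. Two things here are assumptions that go beyond the answers: the loop pairs hasNextLine() with nextLine(), and Character.isWhitespace() is used so that lines starting with a tab are skipped as well. The names SkipLinesDemo and keptLines are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class SkipLinesDemo {
    // Returns the lines that neither start with "//" nor with whitespace.
    static List<String> keptLines(String input) {
        List<String> kept = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            while (scanner.hasNextLine()) {
                String lineOfText = scanner.nextLine();
                boolean startsWithWhitespace = !lineOfText.isEmpty()
                        && Character.isWhitespace(lineOfText.charAt(0));
                if (lineOfText.startsWith("//") || startsWithWhitespace) {
                    continue; // skip this line and move on to the next one
                }
                kept.add(lineOfText);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        String input = "// comment\n indented\nkeep me\n\tTabbed\nalso keep";
        for (String line : keptLines(input)) {
            System.out.println(line);
        }
    }
}
```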
Just use continue, like this:
if (lineOfText.startsWith("//")) {
continue; //would skip the loop to next iteration from here
}
Details: What is the "continue" keyword and how does it work in Java?
If you're just interested in printing out the lines that do not begin with "//", then you should just use the continue keyword in Java.
String lineOfText = scanner.nextLine();
if (lineOfText.startsWith("//")) {
continue;
}
See this post for more information regarding the "continue" keyword.
You can just insert "else" in your code like:
public static void main(String[] args) throws FileNotFoundException {
File dataFile = new File("testfile.txt");
Scanner scanner = new Scanner(dataFile);
while(scanner.hasNext())
{
String lineOfText = scanner.nextLine();
if (lineOfText.startsWith("//")) {
System.out.println();
}
else {
System.out.println(lineOfText);
}
}
scanner.close();
}

Scanner restarting in Java

My task is to read a text file in chunks of 64 characters, and use 2 different processes called Substitution and Column Transposition to encrypt it. Then, I have to decrypt it and write it out to another file.
I have written and tested out both processes of encrypting and decrypting and it worked wonderfully. But then I tried to loop the processes in case more than 64 characters were in the input file.
As a test case, I tried a 128 character input file. Unfortunately, the result only gives me the first 64 characters twice. I've tracked the scanner position and it goes beyond 64, but the characters read start back from 0. I'm not sure what the problem is.
Here is the relevant part of my code:
public static void main(String[] args) {
//Declare variables
Scanner console = new Scanner(System.in);
String inputFileName = null;
File inputFile = null;
Scanner in = null;
do
{
//Check if there are enough arguments
try
{
inputFileName = args[1];
}
catch (IndexOutOfBoundsException exception)
{
System.out.println("Not enough arguments.");
System.exit(1);
}
catch (Exception exception)
{
System.out.println("There was an error. Please try again.");
System.exit(1);
}
//Check if Input File is valid
try
{
inputFile = new File(inputFileName);
in = new Scanner(inputFile);
outputFile = new File(outputFileName);
out = new Scanner(outputFile);
}
catch (FileNotFoundException exception)
{
System.out.println("Could not find input file.");
System.exit(1);
}
catch (Exception exception)
{
System.out.println("There was an error. Please try again.");
System.exit(1);
}
} while (outputFileName != null && !inputFile.exists());
//Encryption
//Prepare patterns
String subPattern = CreateSubstitutionPattern(hash);
int[] transPattern = CreateTranspositionPattern(hash);
//Apply patterns
String textContent = "";
String applySub = "";
String applyTrans = "";
do
{
textContent = Read64Chars(in);
applySub = applySub + ApplySubstitutionPattern(textContent, subPattern);
applyTrans = applyTrans + ApplyTranspositionPattern(applySub, transPattern);
} while (in.hasNext());
//Decryption
String encryptContent = "";
Scanner encrypt = new Scanner(applyTrans);
String removeTrans = "";
String removeSub = "";
do
{
encryptContent = Read64Chars(encrypt);
System.out.println(applyTrans);
removeTrans = removeTrans + RemoveTranspositionPattern(encryptContent, transPattern);
removeSub = removeSub + RemoveSubstitutionPattern(removeTrans, subPattern);
} while (encrypt.hasNext());
console.close();
in.close();
encrypt.close();
System.out.println(removeSub); //For temporary testing
}
public static String Read64Chars (Scanner in)
{
String textContent = "";
in.useDelimiter("");
for (int x=0; x<64; x++)
{
if (in.hasNext())
{
textContent = textContent + in.next().charAt(0);
}
}
return textContent;
}
Do note that I have more variables to fill in args[0] and args[2] but I removed them for simplicity.
I would like to know if it is true that once a scanner reads a portion of its input, it "consumes" it, and that portion gets removed. Does the scanner reset itself when declared again through a method? For example, does the declaration only point to the input source of the original scanner, or the actual scanner with its current properties?
encrypt is a different Scanner from in, which you advance by 64 characters when you first call Read64Chars. So encrypt starts at its own first character when you call Read64Chars(encrypt). It seems like you want to use the same Scanner both times.
Also, in the future please name your methods starting with a lowercase letter. I felt dirty typing that... :)
A proper solution to get the whole encrypted text would be code like this:
public static String encryptedTextFile (Scanner in)
{
//ArrayList<String> stringBlocksOf64Chars = new ArrayList<String>();
StringBuilder encryptedTxt = new StringBuilder();
String currentTxt = "";
while (in.hasNextLine()) {
String line = currentTxt + in.nextLine();
currentTxt = "";
int i = 0;
for( ; i < line.length()/64 ; i++){
currentTxt = line.substring(i * 64, (i+1)*64);
//TODO - encrypt the text before adding it to the list
encryptedTxt.append(currentTxt);//encryptedTxt.append(encrypt(currentTxt));
}
currentTxt = line.substring(i * 64, line.length());
}
encryptedTxt.append(currentTxt);
/*for(String str : stringBlocksOf64Chars)
System.out.println(str);*/
return encryptedTxt.toString();
}
Your loop for (int x=0; x<64; x++) reads at most 64 characters per call and not the complete file. To get around that, you can read the whole file line by line.
The above code block follows this idea.
Steps to break down the logic:
1. Read the file line by line using a Scanner.
2. Break each line into chunks of 64 characters and encrypt one 64-character block at a time.
3. Build the encrypted text by appending each encrypted block.
Whatever you do, first break down the logic into the steps you want to use in your code; that makes it simpler to understand and to write.
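The chunking step on its own can be sketched as follows. This is only an illustration under stated assumptions: the text is already in memory as one String, the encryption itself is left out, and the names ChunkDemo and chunks are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class ChunkDemo {
    // Splits text into consecutive blocks of at most `size` characters;
    // only the last block may be shorter.
    static List<String> chunks(String text, int size) {
        List<String> blocks = new ArrayList<>();
        for (int start = 0; start < text.length(); start += size) {
            int end = Math.min(start + size, text.length());
            blocks.add(text.substring(start, end));
        }
        return blocks;
    }

    public static void main(String[] args) {
        // Each block would be encrypted separately and the results appended.
        for (String block : chunks("abcdefg", 3)) {
            System.out.println(block);
        }
    }
}
```

Each block is derived from its own start index, so no block can be processed twice.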

Read text from file and correct it (commas and dots)[Java]

I have to correct text in the file.
Wherever a comma or dot is in the wrong position, I have to move it to the correct one, e.g.
"Here is ,some text , please correct. this text. " to "Here is, some text, please correct. this text."
I noticed that my code does not work properly. For dots it does not work at all, and for commas it leaves a space before the comma. Do you have any hints?
FileReader fr = null;
String line = "";
String result="";
String []array;
String []array2;
String result2="";
// open the file
try {
fr = new FileReader("file.txt");
} catch (FileNotFoundException e) {
System.out.println("Can not open the file!");
System.exit(1);
}
BufferedReader bfr = new BufferedReader(fr);
// read the lines:
try {
while((line = bfr.readLine()) != null){
array=line.split(",");
for(int i=0;i<array.length;i++){
//if i not equal to end(at the end has to be period)
if(i!=array.length-1){
array[i]+=",";
}
result+=array[i];
}
// System.out.println(result);
array2=result.split("\\.");
for(int i=0;i<array2.length;i++){
System.out.println(array2[i]);
array[i]+="\\.";
result2+=array2[i];
}
System.out.println(result2);
}
} catch (IOException e) {
System.out.println("Can not read the file!");
System.exit(2);
}
// close the file
try {
fr.close();
} catch (IOException e) {
System.out.println("error can not close the file");
System.exit(3);
}
Let's first assume you can use regex. Here is a simple way to do what you want:
import java.io.*;
class CorrectFile
{
public static void main(String[] args)
{
FileReader fr = null;
String line = "";
String result="";
// open the file
try {
fr = new FileReader("file.txt");
} catch (FileNotFoundException e) {
System.out.println("Can not open the file!");
System.exit(1);
}
BufferedReader bfr = new BufferedReader(fr);
// read the lines:
try {
while((line = bfr.readLine()) != null){
line = line.replaceAll("\\s*([,.])\\s*", "$1 ").trim();
System.out.println(line);
}
} catch (IOException e) {
System.out.println("Can not read the file!");
System.exit(2);
}
// close the file
try {
fr.close();
} catch (IOException e) {
System.out.println("error can not close the file");
System.exit(3);
}
}
}
The most important thing is this line: line = line.replaceAll("\\s*([,.])\\s*", "$1 ").trim();. We want to replace "any amount of white space + a comma + any amount of white space" with "a comma + one space", and the same for a dot. "\s" is the regex for a white-space character and "\s*" means "zero or more white-space characters". "[]" denotes a character class, so "[,.]" matches either a "," or a "."; inside a character class the dot is literal and no separator is needed between the characters. In a Java string literal each "\" has to be escaped, which gives "\\s*([,.])\\s*".
The parentheses form a capture group, which saves the character that was matched (either "," or ".") so that it can be referred to as "$1" in the replacement. Since you need a space after the comma or dot, the replacement is "$1 ". Finally, String.trim() removes white space at both ends of the result, including the space that the replacement adds after a dot at the very end of the line.
Now, let's see what's wrong with your original code and why String.split() may not be a good idea here.
Aside from creating lots of new String objects, the most obvious problem is that you (perhaps by a typo) wrote array[i]+="\\."; instead of array2[i]+=".";. The less obvious problem comes from String.split() itself: the segments in the split arrays keep the white space that surrounded the delimiter. The last array element even consists of only a white space.
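As a quick check of the regex approach, the following sketch applies the replacement to the sample sentence from the question. The class and method names (PunctuationFixer, fix) are hypothetical. One limitation to keep in mind: the pattern treats every dot as sentence punctuation, so a number such as 3.14 would also get a space inserted.

```java
class PunctuationFixer {
    // Removes white space around each comma or dot, puts exactly one space
    // after it, then trims the ends of the line.
    static String fix(String line) {
        return line.replaceAll("\\s*([,.])\\s*", "$1 ").trim();
    }

    public static void main(String[] args) {
        String sample = "Here is ,some text , please correct. this text. ";
        // Brackets make leading or trailing spaces visible.
        System.out.println("[" + fix(sample) + "]");

        // split() keeps the white space around the delimiter in the pieces.
        for (String piece : sample.split(",")) {
            System.out.println("[" + piece + "]");
        }
    }
}
```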

Read next word in java

I have a text file that has following content:
ac und
accipio annehmen
ad zu
adeo hinzugehen
...
I read the text file and iterate through the lines:
Scanner sc = new Scanner(new File("translate.txt"));
while(sc.hasNext()){
String line = sc.nextLine();
}
Each line has two words. Is there any method in java to get the next word or do I have to split the line string to get the words?
You do not necessarily have to split the line because java.util.Scanner's default delimiter is whitespace.
You can just create a new Scanner object within your while statement.
Scanner sc2 = null;
try {
sc2 = new Scanner(new File("translate.txt"));
} catch (FileNotFoundException e) {
e.printStackTrace();
}
while (sc2.hasNextLine()) {
Scanner s2 = new Scanner(sc2.nextLine());
while (s2.hasNext()) {
String s = s2.next();
System.out.println(s);
}
}
You already get the next line in this line of your code:
String line = sc.nextLine();
To get the words of a line, I would recommend using:
String[] words = line.split(" ");
Using Scanners, you will end up creating a new object for every line, which generates a fair amount of garbage for the GC with large files. Also, it is nearly three times slower than using split().
On the other hand, if you split by a single space (line.split(" ")), the code will fail if you try to read a file with a different whitespace delimiter. Since split() takes a regular expression and does pattern matching anyway, use split("\\s") instead, which matches more kinds of whitespace than just the space character; split("\\s+") additionally treats a run of whitespace as one delimiter.
P.S.: Sorry, I don't have the right to comment on the answers already given.
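The difference between the delimiters can be sketched like this. The helper name words and the class name WordSplitDemo are hypothetical; trimming first is an added assumption, so that leading white space does not produce an empty first element.

```java
import java.util.Arrays;

class WordSplitDemo {
    // Splits a line on runs of any white space (spaces, tabs, ...).
    static String[] words(String line) {
        return line.trim().split("\\s+");
    }

    public static void main(String[] args) {
        String tabSeparated = "accipio\tannehmen";
        // A single-space delimiter does not match the tab at all.
        System.out.println(Arrays.toString(tabSeparated.split(" ")));
        // The regex version separates the two words.
        System.out.println(Arrays.toString(words(tabSeparated)));
    }
}
```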
You're better off reading a line and then doing a split.
File file = new File("path/to/file");
String words[]; // I miss C
String line;
HashMap<String, String> hm = new HashMap<>();
try (BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(file), "UTF-8")))
{
while ((line = br.readLine()) != null) {
words = line.split("\\s");
if (hm.containsKey(words[0])){
System.out.println("Found duplicate ... handle logic");
}
hm.put(words[0],words[1]); //if index==0 is ur key
}
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
You can just use Scanner to read word by word; Scanner.next() reads the next word:
try {
Scanner s = new Scanner(new File(filename));
while (s.hasNext()) {
System.out.println("word:" + s.next());
}
} catch (IOException e) {
System.out.println("Error accessing input file!");
}
