I am trying to read a file char by char. Unfortunately, Java seems to ignore EOF while reading chars from the file.
FileReader fileReader = new FileReader(fileText);
char c;
String word = "";
List<String> words = new ArrayList<String>();
while ((c = (char) fileReader.read()) != -1) {
System.out.println(c);
if (c != ' ') {
word = word + c;
}
else {
words.add(word + " ");
word = "";
}
}
The loop should exit once the whole file has been read, but instead it never stops running.
In Java, char is unsigned and cannot equal -1. You should do the comparison before you do the cast.
int ch;
while ((ch = fileReader.read()) != -1) {
char c = (char)ch;
System.out.println(c);
...
}
This happens because char cannot be equal to -1, even if you assign -1 to it:
char c = (char)-1;
System.out.println(c == -1); // prints false
Make c an int, and cast it to char only when you concatenate:
word = word + (char)c;
Better yet, use StringBuilder to build strings at runtime: otherwise, you create lots of temporary string objects in a loop, and these objects get thrown away.
StringBuilder word = new StringBuilder();
List<String> words = new ArrayList<String>();
int c;
while ((c = fileReader.read()) != -1) {
System.out.println((char)c);
word.append((char)c);
if (c == ' ') {
words.add(word.toString());
word = new StringBuilder();
}
}
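Note that a loop of this shape only stores a word when it sees a space, so a final word with no trailing space is never added to the list. A minimal self-contained sketch of the same idea follows. It assumes you want the words without the trailing space and with repeated spaces ignored, which differs slightly from the code above. The class name WordSplitter and the method splitWords are made up for illustration, and a StringReader stands in for the FileReader.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class WordSplitter {

    // Reads characters until end of stream and splits them on spaces.
    static List<String> splitWords(Reader reader) throws IOException {
        List<String> words = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        int c; // must stay an int so that -1 (end of stream) is distinguishable
        while ((c = reader.read()) != -1) {
            if (c == ' ') {
                if (word.length() > 0) {   // skip empty words caused by repeated spaces
                    words.add(word.toString());
                    word.setLength(0);     // reuse the builder for the next word
                }
            } else {
                word.append((char) c);     // cast only after the -1 check
            }
        }
        if (word.length() > 0) {           // flush the last word, which has no trailing space
            words.add(word.toString());
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        // StringReader stands in for the FileReader used in the question
        try (Reader reader = new StringReader("read a file char by char")) {
            System.out.println(splitWords(reader));
        }
    }
}
```

To run it against a real file, pass a FileReader (ideally wrapped in a BufferedReader) to splitWords instead of the StringReader.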
Try the code below:
public static void main(String[] args) throws IOException {
FileReader fileReader = new FileReader(fileLocation);
int c;
String word = "";
List<String> words = new ArrayList<String>();
while ((c = (int) fileReader.read()) != -1) {
System.out.println((char)c);
char ch = (char)c;
if (ch != ' ') {
word = word + ch;
} else {
words.add(word + " ");
word = "";
}
}
System.out.println(word);
}
Related
I have this program and I need it to count the lowercase and uppercase A's in a data file. I'm not sure whether to use charAt or substring. It all runs in a while loop, and I suspect I may need to use the next() method. I just need to find these characters and count them in total.
import static java.lang.System.*;
import java.util.*;
import java.io.*;
public class Java2305{
public static void main(String args[]){
new Solution();
}}
class Solution
{
private Scanner fileScan;
Solution()
{
run();
}
void run()
{
int count = 0;
try
{
fileScan = new Scanner(new File("letters01.dat"));
while(fileScan.hasNext() )
{
String getA = fileScan.substring("A");
out.println(getA);
count++;
}
}
catch(Exception e){}
out.println();
out.println("The letter 'A' occurs "+count+" times.");
out.println();
out.println();
}
}
Why are you using Scanner? That is meant for scanning text for delimited tokens using regular expressions, but you are not really using that.
I suggest you use a Reader instead, then you can call its read() method to read individual characters:
int count = 0;
try
{
Reader fileReader = new FileReader("letters01.dat");
/* or:
Reader fileReader = new InputStreamReader(
new FileInputStream("letters01.dat"),
"the file's charset here"
);
*/
int value = fileReader.read();
while (value != -1)
{
char ch = (char) value;
if ((ch == 'a') || (ch == 'A'))
count++;
value = fileReader.read();
}
}
catch(Exception e){}
You can use a BufferedReader to read the file more efficiently:
// declared as BufferedReader (not Reader) so that readLine() is available
BufferedReader fileReader = new BufferedReader(new FileReader("letters01.dat"));
/* or:
BufferedReader fileReader = new BufferedReader(
new InputStreamReader(
new FileInputStream("letters01.dat"),
"the file's charset here"
)
);
*/
And then optionally process it line-by-line instead of char-by-char (though you can still do that, too):
int count = 0;
try
{
String line = fileReader.readLine();
while (line != null)
{
for(int i = 0; i < line.length(); ++i)
{
char ch = line.charAt(i);
if ((ch == 'a') || (ch == 'A'))
count++;
}
line = fileReader.readLine();
}
}
catch(Exception e){}
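The snippets above never close the reader and hide any failure behind an empty catch block. As a rough sketch of how the char-by-char version could look as a complete program, the example below uses try-with-resources and lets IOException propagate. The class name LetterCounter and the method countA are made up for illustration, and a StringReader stands in for the file so the example runs without letters01.dat.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class LetterCounter {

    // Counts every 'a' and 'A' delivered by the given reader.
    static int countA(Reader source) throws IOException {
        int count = 0;
        // try-with-resources closes the reader even if read() throws
        try (BufferedReader reader = new BufferedReader(source)) {
            int value;
            while ((value = reader.read()) != -1) {  // -1 signals end of stream
                char ch = (char) value;
                if (ch == 'a' || ch == 'A') {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // For the real file, pass new FileReader("letters01.dat") instead
        int count = countA(new StringReader("An apple a day\nAARDVARK"));
        System.out.println("The letter 'A' occurs " + count + " times.");
    }
}
```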
I'm trying to encrypt a .txt file, but when I put my chars into an array I lose my spaces. I want to keep the spaces along with the punctuation and the case of the letters. I am so close, but everything I try either A) turns everything into a null character or B) loops capital letters. Thanks in advance.
public class Encryption {
CaesarCipher c= new CaesarCipher();
Scanner kb = new Scanner(System.in);
String end = "";
public void changeLetters(File file) {
System.out.println("How far would you like to shift?");
int shift = Integer.parseInt(kb.nextLine());
Scanner fileScanner;
try {
fileScanner = new Scanner(file);
while (fileScanner.hasNextLine()) {
String line = fileScanner.nextLine();
shift(line, shift);
}
} catch (FileNotFoundException e) {
e.printStackTrace();
}
}
private void shift(String line, int shift) {
char[] og = line.toCharArray();
for (int i = 0; i < og.length; i++) {
char letter = og[i];
letter = (char) (letter + shift);
if (letter > 'z') {
letter = (char) (letter - 26);
} else if (letter < 'a') {
letter = (char) (letter + 26);
}
end = end + Character.toString(letter);
}
System.out.println(end);
File file = new File("Encrypted.txt");
FileWriter writer = null;
{
try {
writer = new FileWriter(file);
writer.write(end);
writer.close();
} catch (
IOException e)
{
e.printStackTrace();
}
System.out.println("Decryption Complete");
System.out.println("Q to quit, C to Continue");
String response = kb.next();
if (response.equals("q") || response.equals("Q")) {
System.out.println("Goodbye");
} else if (response.equals("c") || response.equals("C")) {
c.getInformation();
}
}
}
I believe the problem is that you add or subtract 26, for example letter = (char) (letter - 26);. That wrap-around only works inside the lowercase alphabet [a-z]. Since you also want to handle capital letters, spaces, punctuation and so on, you can't do this.
It would also be cleaner to use the modulo operator %, so you don't need an explicit test such as if (letter > 'z').
Here is the shift procedure, which is really simple:
private String shift(String str, int shift) {
String shifted = "";
for(int i = 0; i < str.length(); i++) {
char original = str.charAt(i);
char shiftedChar = (char) ((original + shift) % Integer.MAX_VALUE);
shifted += shiftedChar; // Append shifted character to the end of the string
}
return shifted;
}
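However, I'm not sure this is the right modulus to use. The cast to char already wraps the value into the 16-bit char range, so the % has little practical effect here. I did some tests and this seemed to work.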
Here is how you can shift and unshift
String test = "This is a test!";
String encoded = shift(test, 3);
String decoded = shift(encoded, -3);
System.out.println("Encoded : " + encoded + "\n" + "Decoded : " + decoded);
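The shift above moves every character, including spaces and punctuation, to a different code point. If the goal is instead a classic Caesar cipher that leaves spaces and punctuation alone and keeps upper and lower case, a minimal sketch could look like this. It assumes only the ASCII letters A-Z and a-z should be rotated; the class name CaesarShift and the helper rotate are made up for illustration.

```java
class CaesarShift {

    // Shifts only ASCII letters; every other character is copied unchanged.
    static String shift(String text, int shift) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            if (ch >= 'a' && ch <= 'z') {
                out.append(rotate(ch, 'a', shift));
            } else if (ch >= 'A' && ch <= 'Z') {
                out.append(rotate(ch, 'A', shift));
            } else {
                out.append(ch); // spaces, digits and punctuation stay as they are
            }
        }
        return out.toString();
    }

    // Rotates ch within the 26 letters that start at base.
    private static char rotate(char ch, char base, int shift) {
        // floorMod keeps the offset in 0..25 even for negative shifts
        return (char) (base + Math.floorMod(ch - base + shift, 26));
    }

    public static void main(String[] args) {
        String encoded = shift("This is a test, XYZ!", 3);
        String decoded = shift(encoded, -3);
        System.out.println("Encoded : " + encoded);
        System.out.println("Decoded : " + decoded);
    }
}
```

Math.floorMod is used rather than % because % can return a negative remainder when the shift is negative, which would push the result outside the alphabet.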
I am trying to write a program that normalizes my text: it removes repeated spaces, prints the other characters from the original file, and adds spaces and start and end symbols.
After the conversion, when I open the txt file I wrote, I see this content:
numa situaã § ã £ o de emergãªncia mã © dica
As you can see, there are some weird characters that I don't want. Maybe it's because of the encoding?
The text is in my language, Portuguese.
Here is my code. How can I fix it?
public static void main(String[] args) throws IOException {
Charset encoding = Charset.defaultCharset();
InputStream in = new FileInputStream(new File("data.txt"));
Reader reader = new InputStreamReader(in, encoding);
Reader buffer = new BufferedReader(reader);
StringBuilder normalizedLanguage = new StringBuilder("<");
int r;
while ((r = buffer.read()) != -1) {
char ch = (char) r;
boolean newline = false;
boolean hasLetterBefore = false;
boolean hasLetterAfter = false;
char symbol = '-';
int lines = 0;
if (newline)
{
normalizedLanguage.append("\n<");
}
if (ch == '\r' || ch == '\n' )
{
lines++;
normalizedLanguage.append(">");
newline = true;
hasLetterBefore = false;
}
else if (Character.isLetterOrDigit(ch))
{
if (hasLetterBefore == true)
{
normalizedLanguage.append(Character.toString(symbol) + Character.toString(Character.toLowerCase(ch)));
}else{
normalizedLanguage.append(Character.toString(Character.toLowerCase(ch)));
}
newline = false;
hasLetterBefore = true;
}
else if (ch == ' ')
{
normalizedLanguage.append(Character.toString(ch));
newline = false;
hasLetterBefore = false;
}
else if (ch == '\t')
{
System.out.println("Tab detected: " + ch);
newline = false;
hasLetterBefore = false;
}
else
{
//Símbolos, entre outros..
if (!hasLetterBefore)
{
normalizedLanguage.append(" " + Character.toString(ch) + " ");
}
else
{
symbol = ch;
}
newline = false;
}
}
String normalizedLanguageString = normalizedLanguage.toString().trim().replaceAll(" +", " ");
PrintWriter out = new PrintWriter("data_after.txt");
out.println(normalizedLanguageString);
out.close();
buffer.close();
reader.close();
in.close();
}
Thank you very much in advance ;)
The problem got solved using another Charset Encoding :)
Change this line:
Charset encoding = Charset.defaultCharset();
To:
Charset encoding = Charset.forName("UTF8");
Thank you very much anyways
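One thing to keep in mind: the PrintWriter in the question is also created without a charset, so the output file is still written with the platform default. A small sketch that names UTF-8 on both sides is shown below. It is not the original program: it only lower-cases the text, and it uses Files.newBufferedReader and Files.newBufferedWriter from java.nio.file, which are my choice rather than anything from the question. The class name Utf8Copy and the method lowerCaseCopy are made up for illustration, and it assumes the input file really is UTF-8.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class Utf8Copy {

    // Copies a text file in lower case, decoding and encoding it as UTF-8.
    static void lowerCaseCopy(Path source, Path target) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(source, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            int r;
            while ((r = reader.read()) != -1) {
                writer.write(Character.toLowerCase((char) r));
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Temporary files stand in for data.txt and data_after.txt
        Path source = Files.createTempFile("data", ".txt");
        Path target = Files.createTempFile("data_after", ".txt");
        // The Portuguese sample text, written with Unicode escapes so that
        // the encoding of this source file does not matter
        String sample = "Numa situa\u00e7\u00e3o de emerg\u00eancia m\u00e9dica";
        Files.write(source, sample.getBytes(StandardCharsets.UTF_8));
        lowerCaseCopy(source, target);
        System.out.println(new String(Files.readAllBytes(target), StandardCharsets.UTF_8));
    }
}
```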
I have this piece of code
private String getMessage(BufferedReader in) throws IOException{
StringBuilder sb = new StringBuilder();
while(true){
char pom = (char) in.read();
sb.append(pom);
if (sb.toString().contains("\r\n")) {
String result = sb.toString();
result = result.replace("\r\n", "");
return result;
}
}
}
I want the user to type a message on the console, and the input should end when he enters '\r\n'. But this doesn't work. Do you have any tips on what the problem could be?
In addition, I don't want to call in.close(); because I will need this input later.
Try this:
private String getMessage(BufferedReader in) throws IOException{
return in.readLine();
}
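OK, if you want to do it character by character, then you can try this:
private String getMessage(BufferedReader in) throws IOException{
StringBuilder sb = new StringBuilder();
int prev = 0;
int cur = in.read(); // read() returns an int; -1 signals end of stream
while (cur != -1 && (prev != '\r' || cur != '\n')) {
sb.append((char) cur);
prev = cur;
cur = in.read();
}
// drop the '\r' that was appended just before the terminating '\n'
if (cur == '\n' && sb.length() > 0) {
sb.setLength(sb.length() - 1);
}
return sb.toString();
}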
I've noticed that a Java String reuses its internal char array to avoid allocating a new one for the new String instance in methods such as substring(). There are several non-public constructors in String for this purpose, which accept a char array and two ints as a range to construct a String instance.
But today I found that split also reuses the char array of the original String instance. I read a very long line from a file, split it on "," and keep only one short column for real use. Because every part of the line secretly holds a reference to the long char array, I get an OutOfMemoryError very soon.
Here is the example code:
ArrayList<String> test = new ArrayList<String>(3000000);
BufferedReader origReader = new BufferedReader(new FileReader(new File(
"G:\\filewithlongline.txt")));
String line = origReader.readLine();
int i = 0;
while ((line = origReader.readLine()) != null) {
String name = line.split(',')[0];
test.add(name);
i++;
if (i % 100000 == 0) {
System.out.println(name);
}
}
System.out.println(test.size());
Is there any standard method in the JDK to make sure that every String instance produced by split is a "real deep copy" and not a "shallow copy"?
For now I am using a very ugly workaround to force the creation of a new String instance:
ArrayList<String> test = new ArrayList<String>(3000000);
BufferedReader origReader = new BufferedReader(new FileReader(new File(
"G:\\filewithlongline.txt")));
String line = origReader.readLine();
int i = 0;
while ((line = origReader.readLine()) != null) {
String name = line.split(',')[0]+" ".trim(); // force creating a String instance
test.add(name);
i++;
if (i % 100000 == 0) {
System.out.println(name);
}
}
System.out.println(test.size());
The simplest approach is to create a new String directly. This is one of the rare cases where it's a good idea.
String name = new String(line.split(",")[0]); // note the use of ","
An alternative is to parse the file yourself.
do {
StringBuilder name = new StringBuilder();
int ch;
while((ch = origReader.read()) >= 0 && ch != ',' && ch >= ' ') {
name.append((char) ch);
}
test.add(name.toString());
} while(origReader.readLine() != null);
String has a copy constructor you can use for this purpose.
final String name = new String(line.substring(0, line.indexOf(',')));
... or, as Peter suggested, just read only up to the ",".
final StringBuilder buf = new StringBuilder();
do {
int ch;
while ((ch = origReader.read()) >= 0 && ch != ',') {
buf.append((char) ch);
}
test.add(buf.toString());
buf.setLength(0);
} while (origReader.readLine() != null);
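The substring variant above throws a StringIndexOutOfBoundsException when a line contains no comma, because indexOf then returns -1. A small sketch of a helper that covers that case is shown below; the class name FirstColumn and the method firstColumn are made up for illustration. As far as I know, substring() copies the characters itself on Java 7u6 and later, so the explicit copy mainly matters on older JVMs like the one described in the question.

```java
class FirstColumn {

    // Returns the text before the first comma, or the whole line if there is none.
    static String firstColumn(String line) {
        int comma = line.indexOf(',');
        String column = (comma < 0) ? line : line.substring(0, comma);
        // Copy so that the result does not keep the long line's characters alive
        // on JVMs where substring() shares the backing array
        return new String(column);
    }

    public static void main(String[] args) {
        System.out.println(firstColumn("alice,42,some very long remainder"));
        System.out.println(firstColumn("no-comma-here"));
    }
}
```

Inside the reading loop this would replace the split call, for example test.add(firstColumn(line));, which also avoids building the full array of columns when only the first one is needed.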