Java replace characters in a TextFile - Alice In Wonderland

I'm trying to write a compressor for text files and I'm stuck at replacing characters.
This is my code:
compress.setOnAction(event ->
{
String line;
try(BufferedReader reader = new BufferedReader(new FileReader(newFile)))
{
while ((line = reader.readLine()) != null)
{
int length = line.length();
String newLine = "";
for (int i = 1; i < length; i++)
{
int c = line.charAt(i);
if (c == line.charAt(i - 1))
{
}
}
}
}
catch (IOException ex)
{
ex.printStackTrace();
}
});
What I want to do: find every word in which two adjacent characters are equal (like 'Took'). When the if statement is true, I want to replace the first of the two equal characters with a 2, so the result looks like 'T2ok'.
I've tried a lot of things, but I keep getting ArrayIndexOutOfBoundsException, StringIndexOutOfBoundsException, and so on.
Hope someone has a great answer :-)
Regards

Create a method that compresses one String as follows:
Loop through every character using a while loop. In a nested while loop, count the duplicates: as long as the next character equals the current one, advance the index (so the duplicate is not written to the output) and increment the occurrence counter.
public String compress(String input){
int length = input.length(); // length of input
int ix = 0; // actual index in input
char c; // actual read character
int ccounter; // occurrence counter of actual character
StringBuilder output = // the output
new StringBuilder(length);
// loop over every character in input
while(ix < length){
// read character at actual index then inc index
c = input.charAt(ix++);
// we count one occurrence of this character here
ccounter = 1;
// while not reached end of line and next character
// is the same as previously read
while(ix < length && input.charAt(ix) == c){
// inc index means skip this character
ix++;
// and inc character occurrence counter
ccounter++;
}
// if more than one occurrence of the character is counted
if(ccounter > 1){
// print the character count
output.append(ccounter);
}
// print the actual character
output.append(c);
}
// return the full compressed output
return output.toString();
}
Now you can use this method to read the input file line by line and write the compressed lines to an output file, using Java 8 streams.
// create input stream that reads line by line, create output writer
try (Stream<String> input = Files.lines(Paths.get("input.txt"));
PrintWriter output = new PrintWriter("output.txt", "UTF-8")){
// compress each input stream line, and print to output
input.map(s -> compress(s)).forEachOrdered(output::println);
} catch (IOException e) {
e.printStackTrace();
}
If you really want to, you can afterwards replace the input file with the output file:
Files.move(Paths.get("output.txt"), Paths.get("input.txt"),StandardCopyOption.REPLACE_EXISTING);
I think this is the most efficient way to do what you want.

try this:
StringBuilder sb = new StringBuilder();
String line;
try(BufferedReader reader = new BufferedReader(new FileReader(newFile)))
{
while ((line = reader.readLine()) != null)
{
if (!line.isEmpty()) {
//clear states
boolean matchedPreviously = false;
char last = line.charAt(0);
sb.setLength(0);
sb.append(last);
for (int i = 1; i < line.length(); i++) {
char c = line.charAt(i);
if (!matchedPreviously && c == last) {
sb.setLength(sb.length()-1);
sb.append(2);
matchedPreviously = true;
} else matchedPreviously = false;
sb.append(last = c);
}
System.out.println(sb.toString());
}
}
}
catch (IOException ex)
{
ex.printStackTrace();
}
This solution uses only a single loop, but it can only handle runs of length 2.

Related

Retrieve number of lines in file from JFileChooser Java

Is there a way in Java to know the number of lines of a file chosen?
The method chooser.getSelectedFile().length() is the only method I've seen so far but I can't find out how to find the number of lines in a file (or even the number of characters)
Any help is appreciated, thank you.
--update--
long totLength = fc.getSelectedFile().length(); // total bytes = 284
double percentuale = 100.0 / totLength; // 0.352112676056338
int read = 0;
String line = br.readLine();
read += line.length();
Object[] s = new Object[4];
while ((line = br.readLine()) != null)
{
s[0] = line;
read += line.length();
line = br.readLine();
s[1] = line;
read += line.length();
line = br.readLine();
s[2] = line;
read += line.length();
line = br.readLine();
s[3] = line;
read += line.length();
}
This is what I tried, but the value of read at the end is less than totLength, and I don't know what File.length() counts in bytes besides the content of the file. As you can see, I'm trying to count characters here, though.

Down and dirty:
long count = Files.lines(chooser.getSelectedFile().toPath()).count();
You may find this little method handy. It gives you the option to ignore counting blank lines in a file:
public long fileLinesCount(final String filePath, boolean... ignoreBlankLines) {
boolean ignoreBlanks = false;
long count = 0;
if (ignoreBlankLines.length > 0) {
ignoreBlanks = ignoreBlankLines[0];
}
try {
if (ignoreBlanks) {
count = Files.lines(Paths.get(filePath)).filter(line -> line.length() > 0).count();
}
else {
count = Files.lines(Paths.get(filePath)).count();
}
}
catch (IOException ex) {
ex.printStackTrace();
}
return count;
}

You could use the JFileChooser to select a file, then open the file with a Scanner (called file here) and increment a counter as you iterate over its lines, like this:
while (file.hasNextLine()) {
count++;
file.nextLine();
}
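On the byte-count mismatch in the question's update: File.length() returns the size in bytes, including the line terminators, while readLine() strips the terminators and line.length() counts chars, so the two totals will normally differ. A minimal sketch that counts lines and characters in one pass follows; the class and method names are made up for illustration, and wrapping the I/O error in UncheckedIOException is just one possible choice.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

public class FileStats {
    /**
     * Hypothetical helper: returns {number of lines, number of chars},
     * where the chars exclude line terminators (readLine() strips them).
     */
    public static long[] countLinesAndChars(Reader source) {
        long lines = 0;
        long chars = 0;
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                lines++;
                chars += line.length();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new long[] { lines, chars };
    }
}
```

With a file chooser, the reader could be obtained with Files.newBufferedReader(chooser.getSelectedFile().toPath()).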

How to continue processing a file when I reach a null String [duplicate]

I'm trying to read in a file that contains a sequence of DNA. Within my program I want to read each subsequence of that DNA of length 4 and store it in my HashMap to count the occurrences of each subsequence. For example, if I have the sequence CCACACCACACCCACACACCCAC and I want every subsequence of length 4, the first 3 subsequences would be:
CCAC, CACA, ACAC, etc.
To do this I have to iterate over the string several times. Here is my implementation:
try
{
String file = sc.nextLine();
BufferedReader reader = new BufferedReader(new FileReader(file + ".fasta"));
Map<String, Integer> frequency = new HashMap<>();
String line = reader.readLine();
while(line != null)
{
System.out.println("Processing Line: " + line);
String [] kmer = line.split("");
for(String nucleotide : kmer)
{
System.out.print(nucleotide);
int sequence = nucleotide.length();
for(int i = 0; i < sequence; i++)
{
String subsequence = nucleotide.substring(i, i+5);
if(frequency.containsKey(subsequence))
{
frequency.put(subsequence, frequency.get(subsequence) +1);
}
else
{
frequency.put(subsequence, 1);
}
}
}
System.out.println();
line = reader.readLine();
}
System.out.println(frequency);
}
catch(StringIndexOutOfBoundsException e)
{
System.out.println();
}
I have a problem when reaching the end of the string, it won't continue to process due to the error. How would I go about getting around that?

You are calling substring(i, i + 5). At the end of the string i + 5 goes out of bounds. Let's say your string is "ABCDEFGH", length 8, your loop will go from i = 0 to i = 7. When i reaches 4 substring(4, 9) cannot be computed and the exception is raised.
Try this:
for(int i = 0; i < sequence - 4; i++)
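For comparison, here is a minimal sketch of the sliding-window count the question describes, applied to the whole line rather than to the single characters produced by split(""). The class name KmerCounter and the parameter k are made up for illustration, and k is assumed to be positive. The loop bound i + k <= length is what keeps substring inside the string.

```java
import java.util.HashMap;
import java.util.Map;

public class KmerCounter {
    /** Counts every overlapping substring of length k in sequence (k assumed > 0). */
    public static Map<String, Integer> countKmers(String sequence, int k) {
        Map<String, Integer> frequency = new HashMap<>();
        // The last valid start index is sequence.length() - k,
        // so i + k never exceeds the length of the string.
        for (int i = 0; i + k <= sequence.length(); i++) {
            // merge inserts 1 for a new key or adds 1 to the existing count
            frequency.merge(sequence.substring(i, i + k), 1, Integer::sum);
        }
        return frequency;
    }
}
```

Calling countKmers(line, 4) for each line and merging the results would give the per-file counts; a line shorter than k simply contributes nothing.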

You can read each line and extract the 4-character substrings directly, without splitting the line first.
You get the error because, near the end of the line, fewer than 4 characters may be left to extract. For example, take the line CCACACC: grouped in fours, the first group, CCAC, is complete, but the second, ACC, is not. So when nucleotide.substring(i, i+5) runs near the end of the line, there is no complete group left to extract and the program throws the exception. Also, to extract 4 characters you need to add 4, not 5.
A workaround is to put the extraction in a try block, as in the edited code below. Replace the loop with the following code.
// line must be declared (but not read) before the loop: String line;
while ((line = reader.readLine()) != null)
{
    for (int i = 0; i < line.length(); i++)
    {
        String subsequence;
        // put the extract operation in a try block
        // to avoid crashing
        try
        {
            subsequence = line.substring(i, i + 4);
        }
        catch (StringIndexOutOfBoundsException e)
        {
            // fewer than 4 characters remain: stop processing this line
            break;
        }
        if (frequency.containsKey(subsequence))
        {
            frequency.put(subsequence, frequency.get(subsequence) + 1);
        }
        else
        {
            frequency.put(subsequence, 1);
        }
    }
}

Based on the title of your post...try changing the condition for your while loop. Instead of using the current:
String line = reader.readLine();
while(line != null) {
// ...... your code .....
}
use this code:
String line;
while((line = reader.readLine()) != null) {
// If file line is blank then skip to next file line.
if (line.trim().equals("")) {
continue;
}
// ...... your code .....
}
That would cover handling blank file lines.
Now, about the StringIndexOutOfBoundsException you are experiencing. I believe by now you basically know why you are getting this exception, so you need to decide what to do about it. When a string is split into chunks of a specific length and the length of a file line is not evenly divisible by that chunk length, there are a few obvious options:
Ignore the remaining characters at the end of the file line. Although this is an easy solution, it's not very feasible since it would produce incomplete data. I don't know anything about DNA, but I'm certain this would not be the route to take.
Add the remaining DNA sequence (even though it's short) to the Map. Again, I know nothing about DNA, so I'm not sure whether this would be a viable solution. Perhaps it is; I simply don't know.
Add the remaining short DNA sequence to the beginning of the next incoming file line and carry on breaking that line into 4-character chunks. Continue doing this until the end of the file is reached, at which point, if the final DNA sequence is short, add it to the Map (or not).
There may of course be other options, and which one to use is something you will need to decide. To assist you, however, here is code covering the three options I've mentioned:
Ignore the remaining characters:
Map<String, Integer> frequency = new HashMap<>();
String subsequence;
String line;
try (BufferedReader reader = new BufferedReader(new FileReader("DNA.txt"))) {
while ((line = reader.readLine()) != null) {
// If file line is blank then skip to next file line.
if (line.trim().equals("")) {
continue;
}
for (int i = 0; i < line.length(); i += 4) {
// Get out of loop when fewer than 4 Chars remain - the tail is ignored
if ((i + 4) > line.length()) {
break;
}
subsequence = line.substring(i, i + 4);
if (frequency.containsKey(subsequence)) {
frequency.put(subsequence, frequency.get(subsequence) + 1);
}
else {
frequency.put(subsequence, 1);
}
}
}
}
catch (IOException ex) {
ex.printStackTrace();
}
Add the remaining DNA sequence (even though it's short) to the Map:
Map<String, Integer> frequency = new HashMap<>();
String subsequence;
String line;
try (BufferedReader reader = new BufferedReader(new FileReader("DNA.txt"))) {
while ((line = reader.readLine()) != null) {
// If file line is blank then skip to next file line.
if (line.trim().equals("")) {
continue;
}
String lineRemaining = "";
for (int i = 0; i < line.length(); i += 4) {
// Fewer than 4 Chars remain - keep them as the short tail and get out of loop
if ((i + 4) > line.length()) {
lineRemaining = line.substring(i);
break;
}
subsequence = line.substring(i, i + 4);
if (frequency.containsKey(subsequence)) {
frequency.put(subsequence, frequency.get(subsequence) + 1);
}
else {
frequency.put(subsequence, 1);
}
}
if (lineRemaining.length() > 0) {
subsequence = lineRemaining;
if (frequency.containsKey(subsequence)) {
frequency.put(subsequence, frequency.get(subsequence) + 1);
}
else {
frequency.put(subsequence, 1);
}
}
}
}
catch (IOException ex) {
ex.printStackTrace();
}
Add the remaining short DNA sequence to the beginning of the next incoming file line:
Map<String, Integer> frequency = new HashMap<>();
String lineRemaining = "";
String subsequence;
String line;
try (BufferedReader reader = new BufferedReader(new FileReader("DNA.txt"))) {
while ((line = reader.readLine()) != null) {
// If file line is blank then skip to next file line.
if (line.trim().equals("")) {
continue;
}
// Add remaining portion of last line to new line.
if (lineRemaining.length() > 0) {
line = lineRemaining + line;
lineRemaining = "";
}
for (int i = 0; i < line.length(); i += 4) {
// Fewer than 4 Chars remain - carry them over to the next line and get out of loop
if ((i + 4) > line.length()) {
lineRemaining = line.substring(i);
break;
}
subsequence = line.substring(i, i + 4);
if (frequency.containsKey(subsequence)) {
frequency.put(subsequence, frequency.get(subsequence) + 1);
}
else {
frequency.put(subsequence, 1);
}
}
}
// If any Chars remain at end of file then add them
// to the Map without overwriting an existing count
if (lineRemaining.length() > 0) {
frequency.merge(lineRemaining, 1, Integer::sum);
}
}
catch (IOException ex) {
ex.printStackTrace();
}

It is not clear at all from the question description, but I'll guess your input file ends with an empty line.
Try removing the last newline in your input file, or alternatively check against empty in your while loop:
while (line != null && !line.isEmpty())

Search for appearances of string inside text

I have a .txt file with some text in it, for example Hello, world.
I'd like to search the whole file and find out how many times a string occurs, as well as the position of each occurrence. For example, "wo" occurs once in the text above. That number should be placed in an edittext. However, I only know how to search for a single char, not a whole string. Can you please help me? Thanks a lot.
BufferedReader reader = new BufferedReader(new FileReader("somefile.txt"));
int ch;
char charToSearch='a';
int counter=0;
while((ch=reader.read()) != -1) {
if(charToSearch == (char)ch) {
counter++;
}
};
reader.close();
System.out.println(counter);

public static int countWord(String word, FileInputStream fis) {
BufferedReader in = new BufferedReader(new InputStreamReader(fis));
String readLine = "";
int count = 0;
try {
while ((readLine = in.readLine()) != null) {
String[] words = readLine.split(" ");
for (String s : words) {
if (s.contains(word))
count++;
}
}
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
return count;
}

You can use something like:
int nFound = 0;
String target = ".............Your long text..................";
String search = "find this";
int startIndex = 0;
do
{
    int index = target.indexOf(search, startIndex);
    if (index != -1)
    {
        // Found
        nFound++;
        // Here the index variable tells you the position of the match
        /* Do your job */
        /* Move the start index past this match to search the rest of the string, until no matches are found */
        startIndex = index + 1;
    }
    else
        break;
} while (true);
Before doing this, concatenate the whole text into the target string, or execute the code above for each line if you are sure the search string can never span the end of one line and the beginning of the next.
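The same indexOf loop can be wrapped in a method that returns every position, so the count is just the size of the list. This is only a sketch: the class and method names are invented for illustration, and returning nothing for an empty search string is an assumption about what the caller wants.

```java
import java.util.ArrayList;
import java.util.List;

public class SubstringSearch {
    /** Returns the start index of every (possibly overlapping) match of search in text. */
    public static List<Integer> findAll(String text, String search) {
        List<Integer> positions = new ArrayList<>();
        if (search.isEmpty()) {
            return positions; // assumption: an empty search string yields no matches
        }
        int index = text.indexOf(search);
        while (index != -1) {
            positions.add(index);
            // Restart one character after the match start, so overlapping matches are found too
            index = text.indexOf(search, index + 1);
        }
        return positions;
    }
}
```

For the question's example, findAll("Hello, world.", "wo") returns a list containing only 7, so the count is 1. To skip overlapping matches, advance by search.length() instead of 1.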

If you are using Java 7 or later, you can read the whole file into a String:
String text = new String(Files.readAllBytes(Paths.get("file")), StandardCharsets.UTF_8);
Then, you can do this:
public void print(String text, String word)
{
    if (word.isEmpty()) return; // an empty word would match forever
    int count = 0;
    int index = text.indexOf(word);
    while (index != -1)
    {
        // index is the absolute position of the match in text
        System.out.printf("Position: %d, Count: %d%n", index, ++count);
        index = text.indexOf(word, index + word.length());
    }
}

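For simplicity, I would read the file line by line and use String.split(String regex):
while ((line = reader.readLine()) != null) {
    // limit -1 keeps trailing empty strings, so a match at the end of the line is not lost
    String[] parts = line.split(regex, -1);
    // parts.length - 1 is the number of matches in this line;
    // the lengths of the parts let you work out their positions.
}
You can also use java.util.Scanner or java.util.regex.Pattern.
But if you are looking for performance, I think String.indexOf is the best approach.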
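As a rough illustration of the java.util.regex.Pattern route, the sketch below counts non-overlapping matches of a literal string. The class and method names are hypothetical; Pattern.quote is used so that characters such as "." are matched literally rather than as regex syntax.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexCount {
    /** Counts non-overlapping occurrences of literal in text. */
    public static int countMatches(String text, String literal) {
        // Pattern.quote turns the search string into a literal pattern
        Matcher m = Pattern.compile(Pattern.quote(literal)).matcher(text);
        int count = 0;
        while (m.find()) {
            count++; // m.start() would give the position of this match
        }
        return count;
    }
}
```

Unlike an indexOf loop that advances by one character, Matcher.find() continues after the end of the previous match, so "aa" is found twice in "aaaa", not three times.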

Java data reader skips my empty new lines (\n)

Okay, I tried everything but I can't find answer. My data reader skips empty next line while reading from a txt file.
It is supposed to strip all the comments from the txt file and print rest of the data as it is. My reader does strip the comments & prints the data but it skips the empty new lines..
MyDataReader.java
public String readLine()
{
String buf = new String();
String readStr = new String();
int end = 0;
int done = 0;
try
{
// checks if line extraction is done and marker has non null value
while (done != 1 && marker != null)
{
readStr = theReader.readLine(); // Reads the line from standard input
if (readStr != null)
{
/* If the first character of line isnt marker */
if (readStr.length() > 0)
{
if (!readStr.substring(0, 1).equalsIgnoreCase(marker))
{
end = readStr.indexOf(marker); // checks if marker exists in the string or not
if (end > 0)
buf = readStr.substring(0, end);
else
buf = readStr;
done = 1; // String extraction is done
}
}
}
else
{
buf = null;
done = 1;
}
}
}
// catches the exception
catch (Exception e)
{
buf = null;
System.out.println(e);
}
return buf;
}
TestMyDataReader.java
String myStr = new String();
myStr = _mdr.readLine();
while (myStr != null)
{
//System.out.println("Original String : " + myStr);
System.out.println(myStr);
myStr = _mdr.readLine();
}

if (readStr.length() > 0)
That's the line of code that is skipping empty lines.

There are lots of issues in this code, but the main problem you are dealing with is that newline characters are not included in the readLine result. Thus, your if statement is not true for a blank line (the line is in fact empty).

Your reader won't include the newline character in readStr, so reading in the line "\n" will make readStr be "", and
readStr.length() > 0
will evaluate to false, thus skipping that line.
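One way to keep the blank lines is to drop only the lines that start with the marker and pass everything else through, including "". The sketch below is not the asker's MyDataReader: the class and method names are made up, it collects the lines into a list instead of returning them one at a time, it compares the marker case-sensitively, and it assumes the marker is a non-empty string.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class CommentFilter {
    /** Returns all lines with comments removed; blank lines are kept as "". */
    public static List<String> stripComments(Reader source, String marker) {
        List<String> result = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                int end = line.indexOf(marker);
                if (end == 0) {
                    continue; // the whole line is a comment: drop it
                }
                // Keep the text before the marker, or the whole line
                // (including an empty one) when there is no marker.
                result.add(end > 0 ? line.substring(0, end) : line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }
}
```

The only lines that disappear are the ones where indexOf(marker) is 0; an empty line has no marker, so it is added unchanged.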

java regular expression getting values from a txt file [duplicate]

I am new to Java. I have one text file with below content.
`trace` -
structure(
list(
"a" = structure(c(0.748701,0.243802,0.227221,0.752231,0.261118,0.263976,1.19737,0.22047,0.222584,0.835411)),
"b" = structure(c(1.4019,0.486955,-0.127144,0.642778,0.379787,-0.105249,1.0063,0.613083,-0.165703,0.695775))
)
)
Now I need to get "a" and "b" as two separate array lists.

You need to read the file line by line. This can be done with a BufferedReader like this:
try {
FileInputStream fstream = new FileInputStream("input.txt");
BufferedReader br = new BufferedReader(new InputStreamReader(fstream));
String strLine;
int lineNumber = 0;
double [] a = null;
double [] b = null;
// Read File Line By Line
while ((strLine = br.readLine()) != null) {
lineNumber++;
if( lineNumber == 4 ){
a = getDoubleArray(strLine);
}else if( lineNumber == 5 ){
b = getDoubleArray(strLine);
}
}
// Close the reader (this also closes the underlying input stream)
br.close();
//print the contents of a
for(int i = 0; i < a.length; i++){
System.out.println("a["+i+"] = "+a[i]);
}
} catch (Exception e) {// Catch exception if any
System.err.println("Error: " + e.getMessage());
}
Assuming "a" and "b" are on the fourth and fifth lines of the file, call a method on those lines that returns an array of double:
private static double[] getDoubleArray(String strLine) {
    // the values sit between "c(" and the first ')' that follows it
    int start = strLine.indexOf("c(") + 2;
    int end = strLine.indexOf(')', start);
    String[] split = strLine.substring(start, end).split(","); // split the values at the ',' characters
    double[] a = new double[split.length];
    for (int i = 0; i < a.length; i++) {
        a[i] = Double.parseDouble(split[i].trim()); // get the double value of the String
    }
    return a;
}
Hope this helps. I would still highly recommend reading the Java I/O and String tutorials.

You can play with split. First find the line in the text that contains "a" (or "b"). Then do something like this:
String[] first = line.split("\\("); // first[2] will contain the values
Then:
String[] values = first[2].split(",");
You will have the numbers in values. Be careful with the closing brackets )) on the last value, and with the "," that follows them. But that is debugging, and it is your job. I gave you the idea.
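Since the question asks about regular expressions, here is a sketch that pulls out every "name" = structure(c(...)) entry with a Pattern, without relying on line numbers. The class name TraceParser and the exact pattern are assumptions based on the sample shown in the question; a file with a different layout would need a different pattern.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TraceParser {
    // Matches: "name" = structure(c(v1,v2,...)
    // group 1 is the name, group 2 the comma-separated values
    private static final Pattern ENTRY =
            Pattern.compile("\"(\\w+)\"\\s*=\\s*structure\\(c\\(([^)]*)\\)");

    /** Returns the values of each named entry, in the order they appear. */
    public static Map<String, List<Double>> parse(String content) {
        Map<String, List<Double>> result = new LinkedHashMap<>();
        Matcher m = ENTRY.matcher(content);
        while (m.find()) {
            List<Double> values = new ArrayList<>();
            for (String token : m.group(2).split(",")) {
                if (!token.trim().isEmpty()) { // tolerate an empty c()
                    values.add(Double.parseDouble(token.trim()));
                }
            }
            result.put(m.group(1), values);
        }
        return result;
    }
}
```

After reading the file into a String, parse(content).get("a") and parse(content).get("b") give the two lists.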
