Jsoup in Java overwrites the file instead of appending to it - java

This code should read an HTML page and write the result to another file. The BufferedWriter writes the file, but when the code is run with a different URL it does not append; it overwrites the file, and the previous content disappears.
What I need is that when Jsoup processes a new HTML page, the result is added to the output file instead of replacing it.
I have already tried writer types other than BufferedWriter.
public class WriteFile
{
public static void main(String args[]) throws IOException
{
String url = "http://www.someurl.com/registers";
Document doc = Jsoup.connect(url).get();
Elements es = doc.getElementsByClass("a_code");
for (Element clas : es)
{
System.out.println(clas.text());
BufferedWriter writer = new BufferedWriter(new FileWriter("D://Author.html"));
writer.append(clas.text());
writer.close();
}
}
}

Don't mistake the append method of BufferedWriter for appending content to the file. It only appends to the given writer.
To actually append additional content to the file, you need to say so when opening the FileWriter. FileWriter has a constructor with an additional parameter for exactly that:
new FileWriter("D://Author.html", /* append = */ true)
You may also be interested in the Java Files API, which spares you instantiating your own BufferedWriter, etc.:
Files.write(Paths.get("D://Author.html"), clas.text().getBytes(), StandardOpenOption.CREATE, StandardOpenOption.APPEND);
Your loop and what you are writing could be simplified further to something like the following (you may then even omit the APPEND open option again, if that makes sense):
Files.write(Paths.get("D://Author.html"),
String.join("" /* or new line? */,
doc.getElementsByClass("a_code")
.eachText()
).getBytes(),
StandardOpenOption.CREATE, StandardOpenOption.APPEND);
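As a minimal sketch of the FileWriter variant described above: the writer is opened once in append mode, outside the loop, so each run adds lines and earlier content survives. The class name AppendDemo, the helper appendLines, and the temporary file are assumptions made for illustration; in the question's code the list would come from the Jsoup elements.

```java
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

public class AppendDemo {
    // Hypothetical helper: writes each text on its own line at the end of the file.
    static void appendLines(Path target, List<String> texts) throws IOException {
        // The second FileWriter argument (true) opens the file in append mode.
        try (BufferedWriter writer = new BufferedWriter(new FileWriter(target.toFile(), true))) {
            for (String text : texts) {
                writer.write(text);
                writer.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for the question's output file.
        Path target = Files.createTempFile("author", ".html");
        appendLines(target, Arrays.asList("first run"));
        appendLines(target, Arrays.asList("second run"));
        // Both runs are still in the file.
        System.out.println(Files.readAllLines(target));
    }
}
```

Opening the writer once per run, rather than once per element as in the question, also avoids reopening the file on every loop iteration.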

Related

FileInputStream and FileOutputStream: Read and write to the same file

I created a text file with the content "Hello" and tried to read these characters from the file and write them back to the same file.
Assumptions:
1. the file now has the content "Hello" (Overwritten)
2. the file now has the content "HelloHello" (Appended)
3. the file now has the content infinite "Hello" (or an exception gets thrown)
Actual result:
The original "Hello" characters were deleted from the text file, and the file was left empty.
Actual test:
@Test
public void testCopyStream() throws IOException {
File workingDir = new File(System.getProperty("user.dir"));
File testFile = new File(workingDir, "/test.txt");
FileReader fin = new FileReader(testFile);
FileWriter fos = new FileWriter(testFile);
copyStream(fin, fos);
fin.close();
fos.close();
}
I have created the following method for "copying" the data from the Reader to the Writer:
private void copyStream(Reader in, Writer out) throws IOException {
int b;
while ((b = in.read()) != -1) {
out.write(b);
}
}
I used the debugger to find the problem, and it shows that b = in.read() was assigned -1 on the first iteration of the while loop. Then I executed the code step by step while inspecting the file's content and found that the "Hello" text was deleted from the file right after the statement final FileWriter fos = new FileWriter(testFile); was executed.
At first I thought this was because the InputStream and OutputStream pointed to the same file, so the file gets sort of "locked" by the JVM for execution safety?
So I tried swapping those two lines:
FileWriter fos = new FileWriter(testFile);
FileReader fin = new FileReader(testFile);
And the result was the same: the file content was wiped right after the statement FileWriter fos = new FileWriter(testFile);
My question is: why does FileWriter clear the content? Is this some behavior related to FileDescriptor? Is there a way to read and write to the same file?
Just FYI,
copyStream() method is working fine, I have tested it with other tests.
It's not about using append() method instead of write()
The statement FileWriter fos = new FileWriter(testFile); truncates the existing file.
It does not make sense to use streaming access to read and write the same file, as this won't give reliable results. Use RandomAccessFile if you want to read and write the same file: it has calls to seek to a position and to perform reads or writes at different positions of a file.
https://docs.oracle.com/javase/7/docs/api/java/io/RandomAccessFile.html
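A small sketch of the RandomAccessFile suggestion, assuming the file is small enough to read into memory in one go. The class name ReadWriteSameFile and the helper duplicateContent are made up for this example; it produces the "appended" outcome from the question.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ReadWriteSameFile {
    // Hypothetical helper: reads the whole file, then writes a copy of it at the end.
    static void duplicateContent(Path file) throws IOException {
        // Mode "rw" opens the file for reading and writing without truncating it.
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            byte[] content = new byte[(int) raf.length()];
            raf.readFully(content);   // read from position 0 to the end
            raf.seek(raf.length());   // position at the end before writing
            raf.write(content);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("test", ".txt");
        Files.write(file, "Hello".getBytes(StandardCharsets.UTF_8));
        duplicateContent(file);
        // prints HelloHello
        System.out.println(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
    }
}
```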
FileWriter actually deletes everything in a file before writing. To preserve the text, use
new FileWriter(file, true);
The true argument is the append parameter of FileWriter. Without it, the file is simply overwritten.

Running multiple PDFs through a PDFBox program

Currently I am trying to use PDFBox in Eclipse to run multiple PDF files in a folder through a text reader that will extract certain terms and output them into a text file that I will then convert to an excel sheet. Currently I have the program and it works correctly for a single PDF file:
public static void main(String args[]) throws IOException {
//Loading an existing document
File file = new File("ADE_acetylfuranoside_120319_pfister.pdf");
PDDocument document = PDDocument.load(file);
//Instantiate PDFTextStripper class
PDFTextStripper pdfStripper = new PDFTextStripper();
//Retrieving text from PDF document
String text = pdfStripper.getText(document);
//..."Actual code that extracts text"...
PrintStream o = new PrintStream(new File("output.txt"));
PrintStream console = System.out;
System.setOut(o);
System.out.println(finalSheet);
My problem is that I want to run 500 PDFs in one folder through this program in Eclipse rather than entering the name of each one individually. I also want the output to look like:
Name1, Number1, ID1
Name2, Number2, ID2
but I think the way it is written now, it will just overwrite line one if I run multiple PDFs through it.
Thanks for the help!
For the first part, you could just use the File class with a FileFilter:
// directoryName could be as simple as "."
File folder = new File(directoryName);
File[] listOfFiles = folder.listFiles(new FileFilter() {
@Override
public boolean accept(File pathname) {
return pathname.getName().toLowerCase().endsWith(".pdf");
}
});
This gives you an array of File objects of all the files in a particular folder/directory. Now you can loop through it with pretty much the code you have.
On the output side, you'll likely want to correlate the output with the input. I'm a bit confused by your code and I'm guessing you'd just like an output file for each input file. So, perhaps, something like:
// index is the value you used to loop through the `listOfFiles` array
try( FileWriter fileWriter = new FileWriter(listOfFiles[index].getName() + ".output.txt" ) ) {
fileWriter.write(finalSheet); // the String text you want in the file, e.g. finalSheet from the question
}
This creates a file named (as taken from your example) "ADE_acetylfuranoside_120319_pfister.pdf.output.txt". Obviously this could change. In this case a new file is created for each input file.
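If a single output file with one line per PDF is what is wanted instead, a sketch along the following lines may help. It keeps one writer open for the whole run, so each PDF adds a line rather than replacing the file. The names BatchRows, writeRows, and extractRow are hypothetical, and extractRow is only a placeholder for the PDFBox extraction code from the question.

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.Arrays;

public class BatchRows {
    // Hypothetical placeholder: the real version would load the PDF with PDFBox
    // and build the "Name, Number, ID" line from the extracted text.
    static String extractRow(File pdf) {
        return pdf.getName() + ", number, id";
    }

    // Writes one line per PDF found in the folder into a single output file.
    static void writeRows(File folder, File output) throws IOException {
        File[] pdfs = folder.listFiles((dir, name) -> name.toLowerCase().endsWith(".pdf"));
        if (pdfs == null) {
            throw new IOException("Not a readable directory: " + folder);
        }
        Arrays.sort(pdfs); // listFiles makes no ordering guarantee
        // The writer is opened once, outside the loop, so no line is overwritten.
        try (PrintWriter out = new PrintWriter(new FileWriter(output))) {
            for (File pdf : pdfs) {
                out.println(extractRow(pdf));
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed locations: PDFs in the working directory, rows in output.txt.
        writeRows(new File("."), new File("output.txt"));
    }
}
```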

Java: create a new file, or overwrite the existing file

What I want to achieve is to create a file regardless of whether the file exists or not.
I tried using File.createNewFile(), but that will only create the file if it does not already exist. Should I use File.delete() and then File.createNewFile()?
Or is there a clearer way of doing it?
FileWriter has a constructor that takes 2 parameters too: The file name and a boolean. The boolean indicates whether to append or overwrite an existing file. Here are two Java FileWriter examples showing that:
Writer fileWriter = new FileWriter("c:\\data\\output.txt", true); //appends to file
Writer fileWriter = new FileWriter("c:\\data\\output.txt", false); //overwrites file
You can use a suitable Writer:
BufferedWriter br = new BufferedWriter(new FileWriter(new File("abc.txt")));
br.write("some text");
It will create a file abc.txt if it doesn't exist. If it does, it will overwrite the file.
You can also open the file in append mode by using another constructor of FileWriter:
BufferedWriter br = new BufferedWriter(new FileWriter(new File("abc.txt"), true));
br.write("some text");
The documentation for the above constructor says:
Constructs a FileWriter object given a File object. If the second
argument is true, then bytes will be written to the end of the file
rather than the beginning.
Calling File#createNewFile is safe, assuming the path is valid and you have write permissions on it. If a file already exists with that name, it will just return false:
File f = new File("myfile.txt");
if (f.createNewFile()) {
// If there wasn't a file there beforehand, there is one now.
} else {
// If there was, no harm, no foul
}
// And now you can use it.
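For the original goal (end up with a fresh file whether or not one existed), the java.nio.file API offers another route. The sketch below passes CREATE and TRUNCATE_EXISTING explicitly to Files.write; the class name CreateOrOverwrite and the helper writeFresh are assumptions for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class CreateOrOverwrite {
    // Hypothetical helper: creates the file if it is missing, otherwise empties it first.
    static void writeFresh(Path file, String content) throws IOException {
        Files.write(file, content.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempDirectory("demo").resolve("output.txt");
        writeFresh(file, "first");  // the file did not exist: it is created
        writeFresh(file, "second"); // the file exists: old content is discarded
        // prints second
        System.out.println(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
    }
}
```

No delete-then-create step is needed with this approach.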

Program Not Creating File. What is Wrong?

I have tried creating a file, using the code below:
import java.io.File;
public class DeleteEvidence {
public static void main(String[] args) {
File evidence = new File("cookedBooks.txt");
}
}
However, the file cookedBooks.txt does not exist anywhere on my computer. I'm pretty new to this, so I'm having problems understanding other threads about similar problems.
You have successfully created an instance of the class File, which is very different from creating an actual file on your hard drive.
Instances of the File class are used to refer to files on the disk. You can use them for many things, for instance:
check if files or directories exist;
create/delete/rename files or directories; and
open "streams" to write data into the files.
To create a file on your hard disk and write some data to it, you could use, for instance, FileOutputStream.
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileOutputStream;
public class AnExample {
public static void main(String... args) throws Throwable {
final File file = new File("file.dat");
try (FileOutputStream fos = new FileOutputStream(file);
DataOutputStream out = new DataOutputStream(fos)) {
out.writeInt(42);
}
}
}
Here, fos is an instance of FileOutputStream, which is an OutputStream that writes all bytes written to it to an underlying file on disk.
Then, I create an instance of DataOutputStream around that FileOutputStream: this way, we can write more complex data types than bytes and byte arrays (which is your only possibility using the FileOutputStream directly).
Finally, four bytes of data are written to the file: the four bytes representing the integer 42. Note that, if you open this file in a text editor, you will see garbage, since the code above did not write the characters '4' and '2'.
Another possibility would have been to use an OutputStreamWriter, which would give you an instance of Writer that can be used to write text (non-binary) files:
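import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
public class AnExample {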
public static void main(String... args) throws Throwable {
final File file = new File("file.txt");
try (FileOutputStream fos = new FileOutputStream(file);
OutputStreamWriter out = new OutputStreamWriter(fos, StandardCharsets.UTF_8)) {
out.write("You can read this with a text editor.");
}
}
}
Here, you can open the file file.txt in a text editor and read the message written to it.
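File evidence = new File(path);
File parent = evidence.getParentFile();
if (parent != null) {
parent.mkdirs(); // create missing parent directories, not a directory named like the file
}
evidence.createNewFile();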
File is an abstract concept of a file which does not have to exist. Simply creating a File object does not actually create a physical object.
You can do this in (at least) two ways.
Write something to the file (referenced by the abstract File object)
Call File#createNewFile
You can also create temporary files using File#createTempFile but I don't think this is what you are trying to achieve.
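The difference between the File object and the file on disk can be observed directly. In this sketch, the class name FileObjectDemo and the helper observe are invented for illustration, and a temporary directory is used so that the file is known not to exist beforehand.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.List;

public class FileObjectDemo {
    // Hypothetical helper: records what exists() and createNewFile() report, in order.
    static List<Boolean> observe(File file) throws IOException {
        boolean existsBefore = file.exists();        // the File object alone creates nothing
        boolean created = file.createNewFile();      // this call creates the file on disk
        boolean existsAfter = file.exists();
        boolean createdAgain = file.createNewFile(); // a second call finds the file already there
        return Arrays.asList(existsBefore, created, existsAfter, createdAgain);
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("demo").toFile();
        File evidence = new File(dir, "cookedBooks.txt");
        // prints [false, true, true, false]
        System.out.println(observe(evidence));
    }
}
```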
You have only created an object which can represent a file. This exists only in memory, though. If you want to access the file, you must use a FileInputStream or a FileOutputStream. The file will then also be created on the drive (in the case of the output stream).
FileOutputStream fo = new FileOutputStream(new File(oFileName));
fo.write("test".getBytes());
fo.close();
This only creates a File object. To create the file itself, you need to call a method on that object, i.e. createNewFile().
So use evidence.createNewFile(); if you are just creating a file.
If you want to create the file in a specific location, specify the path in the file name,
i.e. File evidence = new File("path");
For example, when specifying a path:
String path="abc.txt";
File file = new File(path);
if (file.createNewFile()) {
System.out.println("File is created");
}
else {
System.out.println("File is already created");
}
FileWriter fw = new FileWriter(file, true);
String ab="Hello";
fw.write(ab);
fw.write(summary); // summary is assumed to be a String defined elsewhere
fw.close();

File Delete and Rename in Java

I have the following Java code, which searches an XML file for a specific tag, adds some text to it, and saves the result. I couldn't find a way to rename the temporary file to the original file name. Please suggest a way.
import java.io.*;
class ModifyXML {
public void readMyFile(String inputLine) throws Exception
{
String record = "";
File outFile = new File("tempFile.tmp");
FileInputStream fis = new FileInputStream("InfectiousDisease.xml");
BufferedReader br = new BufferedReader(new InputStreamReader(fis));
FileOutputStream fos = new FileOutputStream(outFile);
PrintWriter out = new PrintWriter(fos);
while ( (record=br.readLine()) != null )
{
if(record.endsWith("<add-info>"))
{
out.println(" "+"<add-info>");
out.println(" "+inputLine);
}
else
{
out.println(record);
}
}
out.flush();
out.close();
br.close();
//Also we need to delete the original file
//outFile.renameTo(InfectiousDisease.xml);//Not working
}
public static void main (String[] args) {
try
{
ModifyXML f = new ModifyXML();
f.readMyFile("This is infectious disease data");
}
catch(Exception e)
{
e.printStackTrace();
}
}
}
Thanks
First delete the original file and then rename the new file:
File inputFile = new File("InfectiousDisease.xml");
File outFile = new File("tempFile.tmp");
if(inputFile.delete()){
outFile.renameTo(inputFile);
}
A good way to rename files is:
File file = new File("path-here");
file.renameTo(new File("new path here"));
In your code there are several issues.
First, your description mentions renaming the original file and adding some text to it. Your code doesn't do that; it opens two files, one for reading and one for writing (with the additional text). That is the right way to do things, as adding text in place is not really feasible with the techniques you are using.
The second issue is that you are writing to a temporary file. A file opened with new File("tempFile.tmp") is an ordinary file, though: it is not removed when it is closed, so your added text stays in tempFile.tmp until you replace the original file with it.
The third issue is that you are modifying XML files as plain text. This sometimes works as XML files are a subset of plain text files, but there is no indication that you attempted to ensure that the output file was an XML file. Perhaps you know more about your input files than is mentioned, but if you want this to work correctly for 100% of the input cases, you probably want to create a SAX writer that writes out all a SAX reader reads, with the additional information in the correct tag location.
You can use
outFile.renameTo(new File(newFileName));
You have to ensure these files are not open at the time.
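As an alternative to delete-then-rename, Files.move with REPLACE_EXISTING can put the temporary file in place of the original in one call, and it throws an exception on failure instead of returning false. This is a sketch only; the class name ReplaceOriginal and the helper replace are assumptions, and both streams must be closed before the move.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class ReplaceOriginal {
    // Hypothetical helper: moves temp over original, replacing it if it exists.
    static void replace(Path temp, Path original) throws IOException {
        Files.move(temp, original, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("demo");
        Path original = dir.resolve("InfectiousDisease.xml");
        Path temp = dir.resolve("tempFile.tmp");
        Files.write(original, "old".getBytes(StandardCharsets.UTF_8));
        Files.write(temp, "new".getBytes(StandardCharsets.UTF_8));
        replace(temp, original);
        // prints new
        System.out.println(new String(Files.readAllBytes(original), StandardCharsets.UTF_8));
    }
}
```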
