I have a program that pulls the source code of a webpage and saves it to a .txt file. It works when I fetch one page at a time, but when I loop over, say, 100 pages, each page's source gets cut off somewhere between 1/4 and 3/4 of the way through (the cutoff point seems arbitrary). Any ideas on why this happens or how I would go about solving it?
My initial thought was that the loop is going too fast for the Java program (I am running it from a PHP script), but then I realized that it technically shouldn't move on to the next item until the current one has finished anyway.
Here is the code I'm using:
import java.io.*;
import java.net.URL;
public class selectout {
public static BufferedReader read(String url) throws Exception{
return new BufferedReader(
new InputStreamReader(
new URL(url).openStream()));}
public static void main (String[] args) throws Exception{
BufferedReader reader = read(args[0]);
String line = reader.readLine();
String thenum = args[1];
FileWriter fstream = new FileWriter(thenum+".txt");
BufferedWriter out = new BufferedWriter(fstream);
while (line != null) {
out.write(line);
out.newLine();
//System.out.println(line);
line = reader.readLine(); }}
}
The PHP is a basic mysql_query with a while(fetch_assoc) loop that grabs each URL from the database, then runs system("java -jar crawl.jar $url $filename");
It then uses fopen and fread to read the new file, and finally saves the source to the database (after escaping strings and such).
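You need to close your output stream after you finish writing each file. When the program exits without closing it, the tail of the page is still sitting in the BufferedWriter's buffer and never reaches the disk, which is why the files look cut off. After your while loop, call out.close(); this flushes the buffer and also closes the underlying fstream.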
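You must flush the stream and close it. Do this on the BufferedWriter (out), not on fstream, because flushing fstream alone leaves the buffered data unwritten.
finally{ //Error handling ignored in my example
out.flush();
out.close(); // also closes fstream
}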
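As a rough illustration of both answers, here is a minimal sketch using try-with-resources (Java 7+), which closes and flushes the writer automatically. The class name PageSaver, the helper copyLines, and the choice of UTF-8 are assumptions made for this example; they are not from the original program.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class PageSaver {
    // Copies every line from source to target. Both resources are closed
    // (and the writer's buffer flushed) when the try block exits, even if
    // an exception is thrown part-way through.
    static void copyLines(Reader source, Path target) throws IOException {
        try (BufferedReader reader = new BufferedReader(source);
             BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                out.write(line);
                out.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0] = URL, args[1] = output name, as in the question.
        // UTF-8 is an assumption; the page may use another encoding.
        Reader page = new InputStreamReader(new URL(args[0]).openStream(), StandardCharsets.UTF_8);
        copyLines(page, Paths.get(args[1] + ".txt"));
    }
}
```

With this shape there is no separate close() call to forget, so the PHP loop should see complete files no matter how many pages it processes.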
I used a regular expression to parse a text file and want to use the resulting groups one and two as follows:
write group two to another file
use group one as that file's name
Unfortunately, no data is written to the file!
I cannot figure out where the problem is. Here is my code:
package javaapplication5;
import java.io.*;
import java.util.regex.*;
public class JavaApplication5 {
public static void main(String[] args) {
// TODO code application logic here
try {
FileInputStream fstream = new FileInputStream("C:/Users/Welcome/Desktop/End-End-Delay.txt");
DataInputStream in = new DataInputStream(fstream);
BufferedReader br = new BufferedReader(new InputStreamReader(in));
File newFile1= new File("C:/Users/Welcome/Desktop/AUV1.txt");
FileOutputStream fos1= new FileOutputStream(newFile1);
BufferedWriter bw1= new BufferedWriter(new OutputStreamWriter(fos1));
String strLine;
while ((strLine = br.readLine()) != null) {
Pattern p = Pattern.compile("sender\\sid:\\s(\\d+).*?End-End\\sDelay:(\\d+(?:\\.\\d+)?)");
Matcher m = p.matcher(strLine);
while (m.find()) {
String b = m.group(1);
String c = m.group(2);
int i = Integer.valueOf(b);
if(i==0){
System.out.println(b);
bw1.write(c);
bw1.newLine();
}
System.out.println(b);
// System.out.println(c);
}
}
}
catch (Exception e) {
System.err.println("Error: " + e.getMessage());
}
}
}
Can anyone help me identify and solve this problem?
You are using a BufferedWriter but never flush it (flushing pushes the buffered contents to disk) or close it at the end of your program.
Because of this, the program exits before the content held in the BufferedWriter is written to the actual file on disk, and the content is lost.
To avoid this, you can either call flush right after writing to bw1,
bw1.write(c);
bw1.newLine();
bw1.flush();
OR
call close before your program ends,
bw1.close(); // ensures all content in the buffered writer is pushed to disk before the JVM exits
Calling flush every time you write data is not really recommended, as it defeats the purpose of buffered writing.
So the best option is to close the BufferedWriter. You can do it in two ways:
Try-with-resources
Manually close the BufferedWriter at the end, preferably in a finally block to ensure it gets called.
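The try-with-resources option might look roughly like the sketch below. The regex is the one from the question; the class name DelayExtractor, the method extract, and its parameters are hypothetical names introduced only for this example.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DelayExtractor {
    // Same pattern as in the question, compiled once instead of once per line.
    private static final Pattern DELAY = Pattern.compile(
            "sender\\sid:\\s(\\d+).*?End-End\\sDelay:(\\d+(?:\\.\\d+)?)");

    // Writes the delay (group 2) of every match whose sender id (group 1)
    // equals senderId, and returns how many lines were written.
    static int extract(Path input, Path output, int senderId) throws IOException {
        int written = 0;
        try (BufferedReader br = Files.newBufferedReader(input);
             BufferedWriter bw = Files.newBufferedWriter(output)) {
            String line;
            while ((line = br.readLine()) != null) {
                Matcher m = DELAY.matcher(line);
                while (m.find()) {
                    if (Integer.parseInt(m.group(1)) == senderId) {
                        bw.write(m.group(2));
                        bw.newLine();
                        written++;
                    }
                }
            }
        } // bw is flushed and closed here, br is closed as well
        return written;
    }
}
```

Returning the count is just a convenience: a result of 0 tells you that the regex or the sender-id condition never matched, as opposed to the data being lost in the buffer.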
Besides all this, you need to make sure that your regex matches and that your condition,
if(i==0){
is satisfied; otherwise the code that writes to the file is never executed, and of course nothing is written.
Also, it is strongly recommended to close every resource you open, such as files and database resources (Connection, Statement, ResultSet).
Hope that helps.
I wrote a simple program that reads the content of a text/log file and writes it to HTML with conditional formatting.
Below is my code.
import java.io.*;
import java.util.*;
class TextToHtmlConversion {
public void readFile(String[] args) {
for (String textfile : args) {
try{
//command line parameter
BufferedReader br = new BufferedReader(new FileReader(textfile));
String strLine;
//Read File Line By Line
while ((strLine = br.readLine()) != null) {
Date d = new Date();
String dateWithoutTime = d.toString().substring(0, 10);
String outputfile = new String("Test Report"+dateWithoutTime+".html");
FileWriter filestream = new FileWriter(outputfile,true);
BufferedWriter out = new BufferedWriter(filestream);
out.write("<html>");
out.write("<body>");
out.write("<table width='500'>");
out.write("<tr>");
out.write("<td width='50%'>");
if(strLine.startsWith(" CustomerName is ")){
//System.out.println("value of String split Client is :"+strLine.substring(16));
out.write(strLine.substring(16));
}
out.write("</td>");
out.write("<td width='50%'>");
if(strLine.startsWith(" Logged in users are ")){
if(!strLine.substring(21).isEmpty()){
out.write("<textarea name='myTextBox' cols='5' rows='1' style='background-color:Red'>");
out.write("</textarea>");
}else{
System.out.println("else if block:");
out.write("<textarea name='myTextBox' cols='5' rows='1' style='background-color:Green'>");
out.write("</textarea>");
} //closing else block
//out.write("<br>");
out.write("</td>");
}
out.write("</td>");
out.write("</tr>");
out.write("</table>");
out.write("</body>");
out.write("</html>");
out.close();
}
//Close the input stream
br.close();
}catch (Exception e){//Catch exception if any
System.err.println("Error: " + e.getMessage());
e.printStackTrace();
}
}
}
public static void main(String args[]) {
TextToHtmlConversion myReader = new TextToHtmlConversion();
String fileArray[] = {"D:/JavaTesting/test.log"};
myReader.readFile(fileArray);
}
}
I would like to enhance my program, and I am unsure whether I should use a Map or a properties file to store the search strings. I am also looking for an approach that avoids the substring method (with hard-coded indexes into the line). Any suggestions are truly appreciated.
From top to bottom:
Don't use wildcard imports.
Don't use the default package.
Restructure your readFile method into several smaller methods.
Use the Java 7 file API (java.nio.file) to read files.
Use a try-with-resources block for your file.
I wouldn't write to the file continuously; build the output and write it once at the end.
Don't catch the general Exception.
Use a finally block to close resources (or the try-with-resources block mentioned before).
And in general: don't create HTML by appending strings; this is a bad pattern in its own right. But well, it seems that is what you want to do.
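To make a few of these points concrete (the java.nio.file API, small methods, writing once at the end), here is a minimal sketch. The class ReportWriter and its methods are hypothetical, and the table layout is simplified to one cell per log line rather than the question's conditional formatting.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

class ReportWriter {
    // Builds the whole document in memory, one table row per log line.
    static String toHtml(List<String> lines) {
        StringBuilder html = new StringBuilder("<html><body><table width='500'>\n");
        for (String line : lines) {
            html.append("<tr><td>").append(escape(line.trim())).append("</td></tr>\n");
        }
        return html.append("</table></body></html>\n").toString();
    }

    // Minimal escaping so that log text cannot break the markup.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Reads the log in one call and writes the report in one call;
    // both Files methods open and close the file themselves.
    static void convert(Path log, Path report) throws IOException {
        List<String> lines = Files.readAllLines(log, StandardCharsets.UTF_8);
        Files.write(report, toHtml(lines).getBytes(StandardCharsets.UTF_8));
    }
}
```

This still builds HTML by appending strings, so it only addresses the file-handling points. It also produces a single html document, whereas the original loop reopens the output file and writes a complete html/body/table skeleton for every input line.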
Edit
Oh, one more: your text file contains some data, right? If your data represents entities (or objects), it would be good to create a POJO for them. I think your text file contains users (right?). Then create a class called User and parse the text file to get a list of all the users in it. Something like:
List<User> users = User.parse("your-file.txt");
Afterwards you have nice User objects, and all your ugly parsing is in one central place.
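A possible shape for such a class is sketched below. It is only a guess at the data: the "CustomerName is" label comes from the question's code, everything else (the single name field, parsing from a list of lines instead of a file name) is an assumption made for this example. The capture group also shows one way to avoid substring with a hard-coded index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class User {
    // The label is taken from the question's log format; the capture group
    // replaces the hard-coded substring(16) offset.
    private static final Pattern NAME =
            Pattern.compile("^\\s*CustomerName is\\s+(.+?)\\s*$");

    private final String name;

    User(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    // Hypothetical parser: creates one User per line that matches the label
    // and silently skips every other line.
    static List<User> parse(List<String> lines) {
        List<User> users = new ArrayList<>();
        for (String line : lines) {
            Matcher m = NAME.matcher(line);
            if (m.matches()) {
                users.add(new User(m.group(1)));
            }
        }
        return users;
    }
}
```

To start from a file, the lines could come from Files.readAllLines(Paths.get("your-file.txt")). If more labels are needed later, the patterns could be kept in a Map keyed by field name, which touches on the Map-versus-properties question.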
I know about the new WatchDir feature, but I want changes made to a FILE, not a directory, to be written into a log file. Any change made to the file should be written directly into the log.txt file. The current code I have, http://pastebin.com/GwURfRbi , writes only the last line into the txt file because it is reading only one line.
I need to tweak it so that it reads a line and, if a change has been made, writes it to the file,
then keeps reading, and as soon as any further change is made, writes it to the txt file immediately. Can anyone help?
code:
import java.io.*;
public class LogMonitor {
public static void main(String[] args) throws Exception {
FileReader fr = new FileReader("D:/test.txt");
BufferedReader br = new BufferedReader(fr);
while (true) {
String line = br.readLine();
if (line == null) {
Thread.sleep(1*1000);
} else {
byte[] y = line.getBytes();
File g = new File("D:/abc.txt");
OutputStream f = new FileOutputStream(g);
f.write( y );
}
}
}
}
I think your problem is that you did not set the append flag to true when creating your FileOutputStream. Try this:
OutputStream f = new FileOutputStream(g, true);
so that it appends each line instead of overwriting the whole file.
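For illustration, a small append helper might look like the sketch below. LogAppender and appendLine are hypothetical names, and UTF-8 is an assumption. Unlike the original loop it adds a line separator, because readLine() strips the line ending and appended lines would otherwise run together.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class LogAppender {
    // Appends one line to target, creating the file if it does not exist.
    // Files.write opens and closes the file on every call, so no stream is
    // left open between iterations of the monitoring loop.
    static void appendLine(Path target, String line) throws IOException {
        byte[] bytes = (line + System.lineSeparator()).getBytes(StandardCharsets.UTF_8);
        Files.write(target, bytes, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }
}
```

In the question's loop, the else branch could then shrink to something like LogAppender.appendLine(Paths.get("D:/abc.txt"), line); which also avoids creating a new, never-closed FileOutputStream for every line.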
I'm trying to write a program that reads a file (a Java source file), builds an ArrayList of certain specified values from that file, and writes that ArrayList to another, resulting file.
I'm using PrintWriter to make the new resulting file. This is a summarised version of my program:
ArrayList<String> exampleArrayList = new ArrayList<String>();
File actualInputFile = new File("C:/Desktop/example.java");
PrintWriter resultingSpreadsheet= new PrintWriter("C:/Desktop/SpreadsheetValues.txt", "UTF-8");
FileReader fr = new FileReader(actualInputFile);
BufferedReader br = new BufferedReader(fr);
String line=null;
while ((line = br.readLine()) != null) {
// code that makes ArrayList
}
for (int i = 0; i < exampleArrayList.size(); i++) {
resultingSpreadsheet.println(exampleArrayList.get(i));
}
resultingSpreadsheet.close();
The problem is that when I run this, nothing gets printed to resultingSpreadsheet. It's completely empty.
BUT, this program works perfectly (meaning that it prints out everything correctly to the resultingSpreadsheet file) when I replace:
File actualInputFile = new File("C:/Desktop/example.java");
which is the file that I want as my input file, and which has a size of 481 KB,
with:
File smallerInputFile = new File("C:/Desktop/smallerExample.txt");
which is really just a smaller .txt example version of the .java source file, and it has a size of 1.08 KB.
I've tried a few things including flushing the PrintWriter, wrapping it around FileWriter, copy-pasting all the code from the .java file into a text file in case it was an extension problem, but these don't seem to work.
I'm starting to think it must be because of the size of the file that the PrintWriter makes, but it's very possible that that's not the problem. Perhaps I need to put everything in a stream (like it says here: http://docs.oracle.com/javase/6/docs/api/java/io/PrintWriter.html)? If so, how would I do that?
Why is reading the bigger actualInputFile and outputting its data correctly such a problem, when everything works fine for the smallerInputFile?
Can anyone help with this?
Check for exceptions while writing to the spreadsheet file, because I really don't think it's a problem of size. Below is sample code that executes successfully on a file of approximately 1 MB.
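import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
public class Test {
/**
* @param args command-line arguments (unused)
*/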
public static void main(String[] args) {
BufferedReader br = null;
try {
String sCurrentLine;
br = new BufferedReader(new FileReader("D:\\AdminController.java"));
while ((sCurrentLine = br.readLine()) != null) {
System.out.println(sCurrentLine);
}
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
if (br != null)br.close();
} catch (IOException ex) {
ex.printStackTrace();
}
}
}
}
This should go as a comment, but I do not have the rep. The documentation lists both write methods and print methods. Have you tried using write() instead?
I doubt it's the size of the file; it may be that, of the two files you are testing, one is a .txt and the other is a .java.
EDIT: Probably the second suggestion of the two. The first is just something I noticed in the docs.
The print and write methods of PrintWriter do not throw IOException. Call the checkError() method, which flushes the stream and returns true if an error has occurred. It is quite possible that an error occurred while processing the larger file, an encoding error for instance.
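To show what checkError() reports, here is a small self-contained sketch. CheckErrorDemo, FailingWriter, and printAll are hypothetical names for this example; FailingWriter only simulates a failing target and says nothing about what actually goes wrong with the larger file.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.io.Writer;

class CheckErrorDemo {
    // Stand-in for a failing disk or encoder: every write fails.
    static final class FailingWriter extends Writer {
        @Override
        public void write(char[] buf, int off, int len) throws IOException {
            throw new IOException("simulated write failure");
        }

        @Override
        public void flush() {
        }

        @Override
        public void close() {
        }
    }

    // Prints all lines and reports whether PrintWriter recorded an error.
    static boolean printAll(Writer target, Iterable<String> lines) {
        PrintWriter pw = new PrintWriter(target);
        for (String line : lines) {
            pw.println(line); // swallows any IOException and sets an internal flag
        }
        boolean failed = pw.checkError(); // flushes, then returns that flag
        pw.close();
        return failed;
    }
}
```

In the question's program the equivalent check would be a call to resultingSpreadsheet.checkError() just before close(), printing a message if it returns true.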
Check your program. When the file is empty, it usually means that your program does not close the PrintWriter before it finishes.
For example, you may have a return somewhere in your program that causes resultingSpreadsheet.close(); not to be run.
import java.io.*;
import java.util.*;
public class Readfilm {
public static void main(String[] args) throws IOException {
ArrayList films = new ArrayList();
File file = new File("filmList.txt");
try {
Scanner scanner = new Scanner(file);
while (scanner.hasNext())
{
String filmName = scanner.next();
System.out.println(filmName);
}
}
catch (FileNotFoundException e)
{
e.printStackTrace();
}
}}
Above is the code I'm currently attempting to use. It compiles fine, but then I get a runtime error:
java.util.NoSuchElementException
at java.util.Scanner.throwFor(Scanner.java:907)
at java.util.Scanner.next(Scanner.java:1416)
at Readfilm.main(Readfilm.java:15)
I've googled the error and not had anything that helped (I only googled the first 3 lines of the error)
Basically, the program I'm writing is part of a bigger program. This part is to get information from a text file which is written like this:
Film one / 1.5
Film two / 1.3
Film Three / 2.1
Film Four / 4.0
with the text being the film title, and the float being the duration of the film (which will have 20 minutes added to it (For adverts) and then will be rounded up to the nearest int)
Moving on, the program is then to put the information in an array so it can be accessed & modified easily from the program, and then written back to the file.
My issues are:
I currently get a runtime error and have no clue how to fix it. (At the moment I'm just trying to read each line and store it in an array, as a base for the rest of the program.) Can anyone point me in the right direction?
I have no idea how to split at "/". I think it's something like .split("/")?
Any help would be greatly appreciated!
Zack.
Your code works, but scanner.next() reads only one whitespace-delimited token at a time rather than a whole line. You can use a BufferedReader instead; here is an example:
import java.io.*;
class FileRead
{
public static void main(String args[])
{
try{
// Open the file (the path is hard-coded here)
FileInputStream fstream = new FileInputStream("textfile.txt");
// Get the object of DataInputStream
DataInputStream in = new DataInputStream(fstream);
BufferedReader br = new BufferedReader(new InputStreamReader(in));
String strLine;
//Read File Line By Line
while ((strLine = br.readLine()) != null) {
// Print the content on the console
System.out.println (strLine);
}
//Close the input stream
in.close();
}catch (Exception e){//Catch exception if any
System.err.println("Error: " + e.getMessage());
}
}
}
And here is a split example:
class StringSplitExample {
public static void main(String[] args) {
String st = "Hello_World";
String str[] = st.split("_");
for (int i = 0; i < str.length; i++) {
System.out.println(str[i]);
}
}
}
I wouldn't use a Scanner; its next() method is for tokenizing (you get one word or symbol at a time). You probably just want a BufferedReader, which has a readLine method, and then line.split("/"), as you suggest, to split each line into two parts.
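Putting readLine and split("/") together might look like the sketch below. FilmReader and Film are hypothetical names, and skipping blank or malformed lines is an assumed policy; the 20 minutes for adverts and the rounding are left out because the unit of the duration is not stated.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

class FilmReader {
    // Simple value holder for one line of the film list.
    static final class Film {
        final String title;
        final double duration;

        Film(String title, double duration) {
            this.title = title;
            this.duration = duration;
        }
    }

    // Reads lines of the form "Film one / 1.5" into Film objects.
    static List<Film> read(Reader source) throws IOException {
        List<Film> films = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines
                }
                String[] parts = line.split("/");
                if (parts.length != 2) {
                    continue; // assumed policy: ignore malformed lines
                }
                // parseDouble throws NumberFormatException if the second part is not a number
                films.add(new Film(parts[0].trim(), Double.parseDouble(parts[1].trim())));
            }
        }
        return films;
    }
}
```

For the file in the question, the call would be something like FilmReader.read(new FileReader("filmList.txt")). Note that a title containing a "/" would be treated as malformed by this sketch.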
Lazy solution:
Scanner scan = ..;
scan.nextLine();