RandomAccessFile.seek() not working on Linux - java

I am using a tail -f style implementation to watch a file for changes (pretty much like this). For this I use a RandomAccessFile, periodically check whether the file length has increased, and if so, seek and read the new lines (everything happens in a separate thread of the FileTailer).
Now, everything works as expected on Windows, but when I tested my program on Linux it does not work as expected. Here is the run() method of the FileTailer class. Specifically, it fails on Linux where file.seek(filePointer) gets called and then file.readLine(), the latter of which surprisingly returns null (although the filePointer gets incremented correctly if I append content to the tailed file at runtime).
public void run() {
    // The file pointer keeps track of where we are in the file
    long filePointer = 0;
    // Determine start point
    if (startAtBeginning) {
        filePointer = 0;
    } else {
        filePointer = logfile.length();
    }
    try {
        // Start tailing
        tailing = true;
        RandomAccessFile file = new RandomAccessFile(logfile, "r");
        while (tailing) {
            // Compare the length of the file to the file pointer
            long fileLength = logfile.length();
            System.out.println("filePointer = " + filePointer + " | fileLength = " + fileLength);
            if (fileLength < filePointer) {
                // Log file must have been rotated or deleted;
                // reopen the file and reset the file pointer
                file = new RandomAccessFile(logfile, "r");
                filePointer = 0;
            }
            if (fileLength > filePointer) {
                // There is data to read
                file.seek(filePointer);
                String line = file.readLine();
                System.out.println("new line = " + line);
                while (line != null) {
                    if (!line.isEmpty()) {
                        try {
                            fireNewFileLine(line);
                        } catch (ParseException e) {
                            e.printStackTrace();
                        }
                    }
                    line = file.readLine();
                }
                filePointer = file.getFilePointer();
            }
            // Sleep for the specified interval
            sleep(sampleInterval);
        }
        // Close the file that we are tailing
        file.close();
    } catch (InterruptedException | IOException e) {
        e.printStackTrace();
    }
}
Like I said, everything works as it should on Windows, but on Linux the String variable "line" is null after it should have been filled with the newly appended line, so fireNewFileLine gets called on null and everything goes to crap.
Does anyone have an idea why this happens on Linux systems?

You don't need all this, or RandomAccessFile. You are always at the end of the file. All you need is this:
public void run() {
    try {
        // Start tailing
        tailing = true;
        BufferedReader reader = new BufferedReader(new FileReader(logfile));
        String line;
        while (tailing) {
            while ((line = reader.readLine()) != null) {
                System.out.println("new line = " + line);
                if (!line.isEmpty()) {
                    try {
                        fireNewFileLine(line);
                    } catch (ParseException e) {
                        e.printStackTrace();
                    }
                }
            }
            // Sleep for the specified interval
            sleep(sampleInterval);
        }
        // Close the file that we are tailing
        reader.close();
    } catch (InterruptedException | IOException e) {
        e.printStackTrace();
    }
}
with maybe some provision for reopening the file.
E&OE
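A minimal sketch of that reopening provision, assuming rotation replaces the file under the same path (logfile, tailing, sampleInterval and fireNewFileLine are the fields and methods used above):
public void run() {
    try {
        tailing = true;
        BufferedReader reader = new BufferedReader(new FileReader(logfile));
        long lastLength = logfile.length();
        String line;
        while (tailing) {
            // If the file shrank, it was truncated or rotated: reopen from the top
            long length = logfile.length();
            if (length < lastLength) {
                reader.close();
                reader = new BufferedReader(new FileReader(logfile));
            }
            lastLength = length;
            while ((line = reader.readLine()) != null) {
                if (!line.isEmpty()) {
                    try {
                        fireNewFileLine(line);
                    } catch (ParseException e) {
                        e.printStackTrace();
                    }
                }
            }
            sleep(sampleInterval);
        }
        reader.close();
    } catch (InterruptedException | IOException e) {
        e.printStackTrace();
    }
}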

Related

Reading large file A, searching for records matching file B, and writing file C in Java

I have two files; assume both are already sorted.
This is just example data; in reality I will have around 30-40 million records in each file, each file 7-10 GB in size, as the row length is big and fixed.
It's a simple text file: once a searched record is found, I'll do some update and write it to the output file.
File A may contain 0 or more records matching an ID from File B.
The goal is to complete this processing in the least amount of time possible.
I am able to do it, but it's a time-consuming process...
Suggestions are welcome.
File A
1000000001,A
1000000002,B
1000000002,C
1000000002,D
1000000002,D
1000000003,E
1000000004,E
1000000004,E
1000000004,E
1000000004,E
1000000005,E
1000000006,A
1000000007,A
1000000008,B
1000000009,B
1000000010,C
1000000011,C
1000000012,C
File B
1000000002
1000000004
1000000006
1000000008
1000000010
1000000012
1000000014
1000000016
1000000018
// Not working as of now: the logic is wrong.
private static void readAndWriteFile() {
    System.out.println("Read Write File Started.");
    long time = System.currentTimeMillis();
    try (
        BufferedReader in = new BufferedReader(new FileReader(Commons.ROOT_PATH + "input.txt"));
        BufferedReader search = new BufferedReader(new FileReader(Commons.ROOT_PATH + "search.txt"));
        FileWriter myWriter = new FileWriter(Commons.ROOT_PATH + "output.txt");
    ) {
        String inLine = in.readLine();
        String searchLine = search.readLine();
        boolean isLoopEnd = true;
        while (isLoopEnd) {
            if (searchLine == null || inLine == null) {
                isLoopEnd = false;
                break;
            }
            if (searchLine.substring(0, 10).equalsIgnoreCase(inLine.substring(0, 10))) {
                System.out.println("Record Found - " + inLine.substring(0, 10) + " | " + searchLine.substring(0, 10));
                myWriter.write(inLine + System.lineSeparator());
                inLine = in.readLine();
            } else {
                inLine = in.readLine();
            }
        }
        in.close();
        myWriter.close();
        search.close();
    } catch (Exception e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
    System.out.println("Read and Write to File done in - " + (System.currentTimeMillis() - time));
}
My suggestion would be to use a database, as said in this answer. Using txt files has a big disadvantage over DBs, mostly because of the lack of indexes and the other points mentioned in that answer.
So what I would do is create a database (there are lots of good ones out there, such as MySQL, PostgreSQL, etc.), create the tables that are needed, and read the file afterward: insert each line of the file into the DB and use the DB to search and update the records.
Maybe this is not an answer to your concrete question of
Motive is to complete this processing in the least amount of time possible.
but it is a worthy suggestion. Good luck.
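A minimal sketch of that approach, assuming an embedded H2 database on the classpath (the table and column names here are made up for illustration):
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class DbLookup {
    public static void main(String[] args) throws Exception {
        // Embedded H2 database stored next to the program; any JDBC database works the same way
        try (Connection con = DriverManager.getConnection("jdbc:h2:./records")) {
            try (PreparedStatement ddl = con.prepareStatement(
                    "CREATE TABLE IF NOT EXISTS file_a (id VARCHAR(10), payload VARCHAR(255))")) {
                ddl.execute();
            }
            // The index is what makes the lookups fast
            try (PreparedStatement idx = con.prepareStatement(
                    "CREATE INDEX IF NOT EXISTS idx_file_a_id ON file_a(id)")) {
                idx.execute();
            }
            // ... bulk-insert the rows of File A here, then look up each ID from File B:
            try (PreparedStatement query = con.prepareStatement(
                    "SELECT payload FROM file_a WHERE id = ?")) {
                query.setString(1, "1000000002");
                try (ResultSet rs = query.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getString(1));
                    }
                }
            }
        }
    }
}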
With this approach (a single merge-style pass over the two sorted files) I am able to process 50M records in 150 seconds on an i3 with 4 GB RAM and an SSD hard drive.
private static void readAndWriteFile() {
    System.out.println("Read Write File Started.");
    long time = System.currentTimeMillis();
    try (
        BufferedReader in = new BufferedReader(new FileReader(Commons.ROOT_PATH + "input.txt"));
        BufferedReader search = new BufferedReader(new FileReader(Commons.ROOT_PATH + "search.txt"));
        FileWriter myWriter = new FileWriter(Commons.ROOT_PATH + "output.txt");
    ) {
        String inLine = in.readLine();
        String searchLine = search.readLine();
        boolean isLoopEnd = true;
        while (isLoopEnd) {
            if (searchLine == null || inLine == null) {
                isLoopEnd = false;
                break;
            }
            // Since both files are already sorted, compare the numeric keys
            // and advance whichever side is behind (a merge join).
            long searchInt = Long.parseLong(searchLine.substring(0, 10));
            long inInt = Long.parseLong(inLine.substring(0, 10));
            if (searchLine.substring(0, 10).equalsIgnoreCase(inLine.substring(0, 10))) {
                System.out.println("Record Found - " + inLine.substring(0, 10) + " | " + searchLine.substring(0, 10));
                myWriter.write(inLine + System.lineSeparator());
            }
            // Which pointer to move..
            if (searchInt < inInt) {
                searchLine = search.readLine();
            } else {
                inLine = in.readLine();
            }
        }
        in.close();
        myWriter.close();
        search.close();
    } catch (Exception e) {
        e.printStackTrace();
    }
    System.out.println("Read and Write to File done in - " + (System.currentTimeMillis() - time));
}

Java too many open files exception

I have a problem in my code; basically I have an array containing some keys:
String[] ComputerScience = { "A", "B", "C", "D" };
and so on, containing 40 entries.
My code reads 900 PDFs from 40 folders corresponding to each element of ComputerScience, manipulates the extracted text and stores the output in a file named A.txt, B.txt, etc...
Each folder "A", "B", etc. contains 900 PDFs.
After a lot of documents, a "Too many open files" exception is thrown.
I believe that I am correctly closing the file handles.
static boolean writeOccurencesFile(String WORDLIST, String categoria, TreeMap<String, Integer> map) {
    File dizionario = new File(WORDLIST);
    FileReader fileReader = null;
    FileWriter fileWriter = null;
    try {
        File cat_out = new File("files/" + categoria + ".txt");
        fileWriter = new FileWriter(cat_out, true);
    } catch (IOException e) {
        e.printStackTrace();
    }
    try {
        fileReader = new FileReader(dizionario);
    } catch (FileNotFoundException e) { }
    try {
        BufferedReader bufferedReader = new BufferedReader(fileReader);
        if (dizionario.exists()) {
            StringBuffer stringBuffer = new StringBuffer();
            String parola;
            StringBuffer line = new StringBuffer();
            int contatore_index_parola = 1;
            while ((parola = bufferedReader.readLine()) != null) {
                if (map.containsKey(parola) && !parola.isEmpty()) {
                    line.append(contatore_index_parola + ":" + map.get(parola).intValue() + " ");
                    map.remove(parola);
                }
                contatore_index_parola++;
            }
            if (!line.toString().isEmpty()) {
                fileWriter.append(getCategoryID(categoria) + " " + line + "\n"); // print the complete document line: N x1:y x2:a ...
            }
        } else {
            System.err.println("Dictionary file not found.");
        }
        bufferedReader.close();
        fileReader.close();
        fileWriter.close();
    } catch (IOException e) {
        return false;
    } catch (NullPointerException ex) {
        return false;
    } finally {
        try {
            fileReader.close();
            fileWriter.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    return true;
}
But the error still comes. It is thrown at:
try {
    File cat_out = new File("files/" + categoria + ".txt");
    fileWriter = new FileWriter(cat_out, true);
} catch (IOException e) {
    e.printStackTrace();
}
Thank you.
EDIT: SOLVED
I found the solution: in the main function from which writeOccurencesFile is called, another function created a RandomAccessFile and didn't close it.
The debugger said the exception was thrown in writeOccurencesFile, but using the Java file leak detector I found out that the PDFs were being opened and not closed after parsing them to plain text.
Thank you!
Try using this utility specifically designed for the purpose.
This Java agent is a utility that keeps track of where/when/who opened files in your JVM. You can have the agent trace these operations to find out about the access pattern or handle leaks, and dump the list of currently open files and where/when/who opened them.
When the exception occurs, this agent will dump the list, allowing you to find out where a large number of file descriptors are in use.
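Typical usage is to attach the agent on the command line, something like this (a sketch; the exact option names vary by version, so check the project's README):
java -javaagent:path/to/file-leak-detector.jar=threshold=200,dumpatshutdown -jar yourapp.jar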
I have tried using try-with-resources, but the problem remains.
Also, running it from the macOS built-in console prints a FileNotFoundException at the line FileWriter fileWriter = ...
static boolean writeOccurencesFile(String WORDLIST, String categoria, TreeMap<String, Integer> map) {
    File dizionario = new File(WORDLIST);
    try (FileWriter fileWriter = new FileWriter("files/" + categoria + ".txt", true)) {
        try (FileReader fileReader = new FileReader(dizionario)) {
            try (BufferedReader bufferedReader = new BufferedReader(fileReader)) {
                if (dizionario.exists()) {
                    StringBuffer stringBuffer = new StringBuffer();
                    String parola;
                    StringBuffer line = new StringBuffer();
                    int contatore_index_parola = 1;
                    while ((parola = bufferedReader.readLine()) != null) {
                        if (map.containsKey(parola) && !parola.isEmpty()) {
                            line.append(contatore_index_parola + ":" + map.get(parola).intValue() + " ");
                            map.remove(parola);
                        }
                        contatore_index_parola++;
                    }
                    if (!line.toString().isEmpty()) {
                        fileWriter.append(getCategoryID(categoria) + " " + line + "\n"); // print the complete document line: N x1:y x2:a ...
                    }
                } else {
                    System.err.println("Dictionary file not found.");
                }
            }
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
    return true;
}
This is the code that I am using now; despite the poor exception handling, why do the files seem not to be closed?
Now I am running a test with File Leak Detector.
Maybe your code raises another exception that you are not handling. Try adding catch (Exception e) before the finally block.
You can also move the BufferedReader declaration out of the try and close it in the finally block, as shown in the sketch below.
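For reference, the reader and the writer can also be opened in a single try-with-resources header, which closes both in the right order even when an exception escapes. A sketch reusing the names from the code above (getCategoryID is the asker's own helper):
static boolean writeOccurencesFile(String WORDLIST, String categoria, TreeMap<String, Integer> map) {
    File dizionario = new File(WORDLIST);
    if (!dizionario.exists()) {
        System.err.println("Dictionary file not found.");
        return false;
    }
    try (BufferedReader bufferedReader = new BufferedReader(new FileReader(dizionario));
         FileWriter fileWriter = new FileWriter("files/" + categoria + ".txt", true)) {
        StringBuilder line = new StringBuilder();
        String parola;
        int contatore = 1;
        while ((parola = bufferedReader.readLine()) != null) {
            if (map.containsKey(parola) && !parola.isEmpty()) {
                line.append(contatore).append(':').append(map.get(parola)).append(' ');
                map.remove(parola);
            }
            contatore++;
        }
        if (line.length() > 0) {
            fileWriter.append(getCategoryID(categoria) + " " + line + "\n");
        }
        return true;
    } catch (IOException e) {
        e.printStackTrace();
        return false;
    }
}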

Gdx.files.internal(...) wrapper not working correctly

I made a wrapper ConfigurationFile class to help handle Gdx.files, and it worked fine for a long time, but now it's not working and I don't know why.
I have the following two methods: internal(...) and local(...). The only difference between the two is that they build the handle from (File folder, String name) versus (String path).
-Snip Now Unnecessary Information-
UPDATE
After more debugging, I found out that they're not behaving the same. I have an assets/files/ folder that Gdx.files.internal(...) accesses fine, but ConfigurationFile.internal(...) accesses files/, even though they're set up the same way. Here are the two pieces of code that I used for testing.
Using Gdx.files.internal(...) directly (works as expected):
FileHandle handle = Gdx.files.internal("files/virus_data");
BufferedReader reader = null;
try {
    reader = new BufferedReader(handle.reader());
    String c = "";
    while ((c = reader.readLine()) != null) {
        System.out.println(c); // prints out all 5 lines of the file
    }
} catch (IOException e) {
    e.printStackTrace();
} finally {
    try {
        if (reader != null) reader.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
Using ConfigurationFile.internal(...):
// First part, calls ConfigurationFile#internal(String path)
ConfigurationFile config = ConfigurationFile.internal("files/virus_data");

// ConfigurationFile#internal(String path)
public static ConfigurationFile internal(String path) {
    ConfigurationFile config = new ConfigurationFile();
    // This is literally calling Gdx.files.internal("files/virus_data");
    config.handle = Gdx.files.internal(path);
    config.file = config.handle.file();
    config.folder = config.file.getParentFile();
    config.init();
    return config;
}

// ConfigurationFile#init()
protected void init() {
    // File not found.
    // Creates a new folder as a sibling of "assets"
    // Creates a new file called "virus_data"
    if (!folder.exists()) folder.mkdirs();
    if (!file.exists()) {
        try {
            file.createNewFile();
        } catch (IOException e) {
            e.printStackTrace();
        }
    } else loadFile();
}

// ConfigurationFile#loadFile()
protected void loadFile() {
    BufferedReader reader = null;
    try {
        reader = new BufferedReader(handle.reader());
        String c = "";
        while ((c = reader.readLine()) != null) {
            System.out.println(c);
            if (!c.contains(":")) continue;
            String[] values = c.split(":");
            String key = values[0];
            String value = values[1];
            if (values.length > 2) {
                for (int i = 2; i < values.length; i++) {
                    value += ":" + values[i];
                }
            }
            key = key.trim();
            value = value.trim();
            mapValues.put(key, value);
        }
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        try {
            if (reader != null) reader.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
What I'm having trouble understanding is what difference between these two approaches causes my ConfigurationFile to create a new file in a folder that is a sibling of assets. Could someone tell me why this is happening?
My suggestion is not to use
Gdx.files.internal(folder + "/" + name);
If you have to use the File API, do it this way:
Gdx.files.internal(new File(folder, name).toString());
This way you avoid weird things that could be happening with path separators.
If Gdx maybe needs relative paths for some reason (perhaps relative to some Gdx internal home directory), you could use NIO to do something like
final Path gdxHome = Paths.get("path/to/gdx/home");
//...
File combined = new File(folder, name);
String relativePath = gdxHome.relativize(combined.toPath()).toString();
Okay, so after intense testing, I found the problem, which turned out to be ridiculous.
Since the file is internal, a proper new File(...) reference can't be made to it; it is backed by an InputStream instead (if I'm correct). In any case, calling FileHandle#file() on an internal file causes some kind of conversion of the path, so removing anything that dealt with FileHandle#file() for internal files fixed it.
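A minimal sketch of that workaround, assuming reads go through the handle's reader and anything that must be created or written lives in local storage (libgdx internal files are read-only):
import java.io.BufferedReader;
import java.io.IOException;

import com.badlogic.gdx.Gdx;
import com.badlogic.gdx.files.FileHandle;

public class ConfigLoader {
    // Read an internal (read-only) file through its reader,
    // never through FileHandle#file(), which is not valid for internal files.
    public static void printInternal(String path) {
        FileHandle handle = Gdx.files.internal(path);
        try (BufferedReader reader = new BufferedReader(handle.reader())) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Files that the game creates or updates belong in local storage instead
    public static void writeLocal(String path, String content) {
        FileHandle handle = Gdx.files.local(path);
        handle.writeString(content, false); // false = overwrite rather than append
    }
}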

Problem updating list of data

private static void deleteProxy(File proxyOld, String host, int port) {
    try {
        String lines, tempAdd;
        boolean removeLine = false;
        File proxyNew = new File("proxies_" + "cleaner$tmp");
        BufferedReader fileStream = new BufferedReader(new InputStreamReader(new FileInputStream(proxyOld)));
        BufferedWriter replace = new BufferedWriter(new FileWriter(proxyNew));
        while ((lines = fileStream.readLine()) != null) {
            tempAdd = lines.trim();
            if (lines.trim().equals(host + ":" + port)) {
                removeLine = true;
            }
            if (!removeLine) {
                replace.write(tempAdd);
                replace.newLine();
            }
        }
        fileStream.close();
        replace.close();
        proxyOld.delete();
        proxyNew.renameTo(proxyOld);
    } catch (Exception e) {
        e.printStackTrace();
    }
}
Calling the function:
File x = new File("proxies.txt"); // is calling a new file the reason why it's being flushed out?
deleteProxy(x, host, port);
Before I run the program, the file proxies.txt has data inside it. However, when I run the program it appears to be flushed out: it becomes empty.
I noticed that while the program is running, if I hover my mouse over proxies.txt, Windows displays a "Date Modified" time equal to the current time, i.e. the last time deleteProxy(...) was executed.
Does anyone know what I'm doing wrong? And why won't the list update instead of ending up empty?
Updated code:
private static void deleteProxy(File proxyOld, String host, int port) {
    try {
        String lines, tempAdd;
        boolean removeLine = false;
        File proxyNew = new File("proxies_" + "cleaner$tmp");
        FileInputStream in = new FileInputStream(proxyOld);
        InputStreamReader read = new InputStreamReader(in);
        BufferedReader fileStream = new BufferedReader(read);
        FileWriter write = new FileWriter(proxyNew);
        BufferedWriter replace = new BufferedWriter(write);
        while ((lines = fileStream.readLine()) != null) {
            tempAdd = lines.trim();
            if (lines.trim().equals(host + ":" + port)) {
                removeLine = true;
            }
            if (!removeLine) {
                replace.write(tempAdd);
                replace.newLine();
            }
        }
        in.close();
        read.close();
        fileStream.close();
        write.close();
        replace.close();
        if (proxyOld.delete()) {
            throw new Exception("Error deleting " + proxyOld);
        }
        if (proxyNew.renameTo(proxyOld)) {
            throw new Exception("Error renaming " + proxyOld);
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}
Running the updated code, it deletes proxies.txt just fine, but it fails to make the new file.
Maybe I should find a new method to update a text file. Do you have any suggestions?
Your rename operation may not work, as per the File.renameTo() documentation:
Many aspects of the behavior of this method are inherently platform-dependent: The rename operation might not be able to move a file from one filesystem to another, it might not be atomic, and it might not succeed if a file with the destination abstract pathname already exists. The return value should always be checked to make sure that the rename operation was successful.
So basically, you're wiping your old file, and you're not guaranteed the new file will take its place. You must check the return value of File.renameTo():
if (!proxyNew.renameTo(proxyOld)) {
    throw new Exception("Could not rename proxyNew to proxyOld");
}
As for why your renameTo may be failing: you're not closing the nested set of streams that you open to read from the old file, so the operating system may still consider the file to be in use. Try making sure you close all of the nested streams you open.
Try this:
FileInputStream in = new FileInputStream(proxyOld);
BufferedReader fileStream = new BufferedReader(new InputStreamReader(in));
...
in.close();

Running shell script from Java

I am trying to run some shell scripts from Java using the Commons Exec package, and I clear the STDOUT and STDERR buffers using a PumpStreamHandler. Most of the scripts run fine without any problems, but some of them hang.
Particularly those scripts that take some time to return. My guess is that the PumpStreamHandler might be reading end-of-stream, as nothing is put on the stream for a while, and after that the buffers fill up.
Is there any better way to get around this problem?
Extract the script/command being executed and run it yourself in a shell. When running things that are exec'd through some other language (C, C++, Python, Java, etc.) and things start going wrong, this should be the first step.
You find all sorts of things going on: scripts that stop and prompt for input (a big source of hang-ups), errors that don't parse correctly, seg faults, files not found.
To expand on the first answer about running the commands directly to test: you can test your hypothesis with a simple script that sleeps for a while before returning output. If you can't test your command, test your idea.
#!/bin/bash
sleep 60;
echo "if you are patient, here is your response"
Not the best solution, but it does what I need. :)
class OSCommandLogger extends Thread {
    private static final Logger logger = Logger.getLogger(OSCommandLogger.class);
    private volatile boolean done = false;
    private final String name;
    // Each process is associated with an error and output stream
    private final BufferedReader outputReader;
    private final BufferedReader errorReader;
    private final Logger log;

    /**
     * Reads the output & error streams of the process and writes them to
     * the specified log
     *
     * @param p
     * @param name
     * @param log
     */
    OSCommandLogger(Process p, String name, Logger log) {
        // Create readers
        outputReader = new BufferedReader(new InputStreamReader(p.getInputStream()));
        errorReader = new BufferedReader(new InputStreamReader(p.getErrorStream()));
        this.log = log;
        if (name != null)
            this.name = name;
        else
            this.name = "OSCommandStreamsLogger";
    }

    private void logLine(BufferedReader reader, boolean isError) {
        try {
            String line = null;
            while ((line = reader.readLine()) != null) {
                if (log != null && log.isDebugEnabled()) {
                    if (!isError)
                        log.debug("[OutputStream] " + line);
                    else
                        log.warn("[ErrorStream] " + line);
                } else
                    logger.debug(line);
            }
        } catch (Exception ex) {
            if (log != null)
                log.error(name + ":" + "Error while reading command process stream", ex);
        }
    }

    public void run() {
        while (!done) {
            logLine(outputReader, false);
            logLine(errorReader, true);
            try {
                // Sleep for a while before reading the next lines
                Thread.sleep(100);
            } catch (InterruptedException e) {
                log.debug("Done with command");
            }
        }
        // Process is done. Close all the streams
        try {
            logLine(outputReader, false);
            outputReader.close();
            logLine(errorReader, true);
            errorReader.close();
            if (log != null && log.isDebugEnabled())
                log.debug(name + ": Closed output/error streams.");
        } catch (IOException ie) {
            if (log != null)
                log.error(name + ":" + "Error while reading command process stream", ie);
        }
    }

    public void stopLoggers() {
        if (log != null && log.isDebugEnabled())
            log.debug(name + ": Stop loggers");
        this.done = true;
    }
}
Usage:
Process p = Runtime.getRuntime().exec("Command");
OSCommandLogger logger = new OSCommandLogger(p, "Command", log);
// Start the thread using a thread pool
threadExec.executeRunnable(logger);
int exitValue = p.waitFor(); // Wait till the process is finished
// Required to stop the logger threads
logger.stopLoggers();
logger.interrupt();
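Since the question was about Commons Exec specifically: the library's own PumpStreamHandler plus an ExecuteWatchdog usually covers this without a hand-rolled logger thread. A sketch, assuming a script path of /path/to/script.sh and a 60-second timeout (both made up for illustration):
import java.io.ByteArrayOutputStream;

import org.apache.commons.exec.CommandLine;
import org.apache.commons.exec.DefaultExecutor;
import org.apache.commons.exec.ExecuteWatchdog;
import org.apache.commons.exec.PumpStreamHandler;

public class ScriptRunner {
    public static void main(String[] args) throws Exception {
        CommandLine cmd = CommandLine.parse("/path/to/script.sh");

        // Capture STDOUT and STDERR so the child's pipe buffers can never fill up
        ByteArrayOutputStream stdout = new ByteArrayOutputStream();
        ByteArrayOutputStream stderr = new ByteArrayOutputStream();

        DefaultExecutor executor = new DefaultExecutor();
        executor.setStreamHandler(new PumpStreamHandler(stdout, stderr));
        // Kill the script if it runs longer than 60 seconds
        executor.setWatchdog(new ExecuteWatchdog(60_000));

        int exitValue = executor.execute(cmd);
        System.out.println("exit = " + exitValue);
        System.out.println("stdout:\n" + stdout);
        System.out.println("stderr:\n" + stderr);
    }
}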
