I know this may get downvoted, but this is bothering me a lot.
I have already read the existing posts on the .close() method, such as:
explain the close() method in Java in Layman's terms
Why do I need to call a close() or shutdown() method?
the usage of close() method(Java Beginner)
I still have these questions, which may seem trivial:
1. What does the word 'resource' exactly mean? Is it the file, the 'FileWriter' object, or something else? (Please explain as broadly as possible.)
Let's consider the following code:
import java.io.*;

public class characterstreams
{
    public static void main(String[] args) throws Exception
    {
        File f = new File("thischaracter.txt");
        FileWriter fw = new FileWriter(f);
        char[] ch = {'a', 'c', 'd'};
        fw.write('a');
        fw.write(ch);
        fw.write("aaaa aaaaa aaaaaaa");
        fw.flush();

        FileReader fr = new FileReader(f);
        int r = fr.read();
        System.out.println(r);
        char[] gh = new char[30];
        System.out.println(fr.read(gh));
    }
}
After compiling and executing it:
G:/>java characterstreams
Let's say the resource below is the FileWriter (since I have yet to understand what 'resource' means).
The JVM starts and opens up the so-called resources; then execution completes, after which the JVM shuts down.
2. At that point it releases the resources it has opened, since it is no longer running, right? (Correct me if I am wrong.)
G:/>
At this point the JVM is not running.
3. Before shutting down, the garbage collector is called, right? (Correct me if I am wrong.) So the FileWriter object gets destroyed.
Then why are we supposed to close all the resources that we have opened?
And:
4. I have also read that 'resources get leaked'. What is that supposed to mean?
A resource is anything that the JVM and/or the operating system needs in order to provide the functionality you request.
Take your example. If you open a FileWriter, the operating system will in general (the details depend on the operating system, file system, etc.) do the following (assuming you want to write a file to a disk, such as an HDD/SSD):
create a directory entry for the requested filename
create a data structure to track the writing process for the file
allocate disk space as you actually write data to the file
(note: this is not an exhaustive list)
These steps happen for every file you open for writing. If you don't close the resource, all of this remains in memory and is still maintained by the operating system.
Now assume your application runs for a long time and constantly opens files. The number of files the operating system allows you to keep open at once is limited (the concrete number depends on the operating system, quota settings, ...). Once those resources are exhausted, something will behave unexpectedly or fail.
Below is a small demonstration on Linux.
public static void main(String[] args) throws IOException {
    // Open 1000 files for writing and never close them.
    List<OutputStream> files = new ArrayList<>();
    for (int i = 0; i < 1000; i++) {
        files.add(Files.newOutputStream(Paths.get("/tmp/demo." + i),
                StandardOpenOption.CREATE));
    }
}
The code opens one thousand files for writing.
Assume your limit of open files is 1024:
ulimit -n
1024
If you run the snippet, it will generate 1000 files /tmp/demo.*.
If your limit of open files is only 100, the code will fail:
ulimit -n 100
java.nio.file.FileSystemException: /tmp/demo.94: Too many open files
(it fails before reaching 100 because the JVM itself already has some files open)
To prevent such problems (running out of resources) you should close files that you no longer need to write to. If you don't do that in Java (via close()), the operating system doesn't know that the associated memory etc. can be freed and used for another request.
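Applied to the code in the question, a minimal sketch using try-with-resources (Java 7+) closes the FileWriter and FileReader automatically, even if an exception is thrown:
import java.io.*;

public class characterstreams
{
    public static void main(String[] args) throws Exception
    {
        File f = new File("thischaracter.txt");
        try (FileWriter fw = new FileWriter(f)) {
            char[] ch = {'a', 'c', 'd'};
            fw.write('a');
            fw.write(ch);
            fw.write("aaaa aaaaa aaaaaaa");
        } // fw.close() is called here automatically (close() also flushes)

        try (FileReader fr = new FileReader(f)) {
            System.out.println(fr.read());
            char[] gh = new char[30];
            System.out.println(fr.read(gh));
        } // fr.close() is called here automatically
    }
}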
Related
I am trying to parse a pcap file in two different ways, using two different methods. The pcap file is passed to the class (which contains both methods) when it is created. When I use the pcap file in the first method, it loops without any problem. However, when I go to parse it a second time in the second method, nothing happens when I try to print each packet. I tried passing the pcap file directly to the second method and still no dice. Do I need to reset a counter/pointer? Any ideas?
How pcap file is loaded from disk
pcap = Pcap.openStream(pcapPath);
How class constructor intakes pcap file
public PcapParsing(Pcap pcap) {
this.pcap = pcap;
}
How both methods parse the pcap file
public void arpFloodDetect(Pcap ppcap)
{
    try {
        ppcap.loop((final Packet packet) -> {
            System.out.println(ppcap.toString());
            return true;
        });
    } catch (IOException e) {
        e.printStackTrace();
    }
}
Do I need to reset a counter/pointer?
You need to create a new Pcap by calling Pcap.openStream again. The Pcap API does not expose any methods for resetting the underlying stream.
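A minimal sketch of that, reusing only the calls already shown in the question (the method name and the placeholder comments are made up):
public void parseTwice(String pcapPath) throws IOException {
    // First pass: open a fresh Pcap and loop over the packets.
    Pcap firstPass = Pcap.openStream(pcapPath);
    firstPass.loop((final Packet packet) -> {
        // inspect the packet for the first analysis ...
        return true;
    });

    // Second pass: the first stream is exhausted, so open the file again.
    Pcap secondPass = Pcap.openStream(pcapPath);
    secondPass.loop((final Packet packet) -> {
        // inspect the packet for the second analysis ...
        return true;
    });
}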
Pcap files can get large, like a couple of gigabytes or more. Will this add a significant load penalty each time I call it?
It depends on how good your file system is. If we assume that your file system is on a fast local SSD, and you are running an OS which uses RAM for file system buffer caching, then reading a big file will be fast the first time, and faster the second time.
It also depends on what you mean by "significant", and what is acceptable. And how much money you are prepared to pay to upgrade your hardware to achieve acceptable performance.
Would you happen to know a different way of loading files that avoids a penalty if there is one?
Basically, no.
The only other alternatives I can think of involve reading or mapping the entire file into the JVM's address space and then wrapping it in an InputStream. You would still need to create a Pcap for each pass through the file.
But the problem with this is that it requires as much JVM address space as the size of the file you are processing. If the file is significantly bigger than the amount of physical RAM available, it can get horrible:
In the best case your performance will be equivalent to re-reading the file from disk.
In the worst case your application thrashes and brings the operating system to its knees (or gets OOM-killed to prevent that).
The current Pcap implementation is designed to avoid that by not caching the data in RAM. That is how it is able to cope with huge input files without running out of memory, etc.
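A sketch of that alternative, assuming Pcap.openStream also accepts an InputStream (an assumption, not confirmed here) and that the file fits comfortably in the heap:
// Assumption: Pcap.openStream(InputStream) exists. Read the capture into memory once,
// then give each pass its own cheap in-memory stream, so the disk is not re-read.
byte[] capture = Files.readAllBytes(Paths.get(pcapPath));
Pcap pass = Pcap.openStream(new ByteArrayInputStream(capture));
pass.loop((final Packet packet) -> {
    // analysis for this pass ...
    return true;
});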
I'm doing some file I/O with multiple files (writing to 19 files, it so happens). After writing to them a few hundred times I get the Java IOException: Too many open files. But I actually have only a few files opened at once. What is the problem here? I can verify that the writes were successful.
On Linux and other UNIX / UNIX-like platforms, the OS places a limit on the number of open file descriptors that a process may have at any given time. In the old days, this limit used to be hardwired1, and relatively small. These days it is much larger (hundreds / thousands), and subject to a "soft" per-process configurable resource limit. (Look up the ulimit shell builtin ...)
Your Java application must be exceeding the per-process file descriptor limit.
You say that you have 19 files open, and that after a few hundred times you get an IOException saying "too many files open". Now this particular exception can ONLY happen when a new file descriptor is requested; i.e. when you are opening a file (or a pipe or a socket). You can verify this by printing the stacktrace for the IOException.
Unless your application is being run with a small resource limit (which seems unlikely), it follows that it must be repeatedly opening files / sockets / pipes, and failing to close them. Find out why that is happening and you should be able to figure out what to do about it.
FYI, the following pattern is a safe way to write to files that is guaranteed not to leak file descriptors.
Writer w = new FileWriter(...);
try {
    // write stuff to the file
} finally {
    try {
        w.close();
    } catch (IOException ex) {
        // Log error writing file and bail out.
    }
}
1 - Hardwired, as in compiled into the kernel. Changing the number of available fd slots required a recompilation ... and could result in less memory being available for other things. In the days when Unix commonly ran on 16-bit machines, these things really mattered.
UPDATE
The Java 7 way is more concise:
try (Writer w = new FileWriter(...)) {
    // write stuff to the file
}   // the `w` resource is automatically closed
UPDATE 2
Apparently you can also encounter a "too many files open" while attempting to run an external program. The basic cause is as described above. However, the reason that you encounter this in exec(...) is that the JVM is attempting to create "pipe" file descriptors that will be connected to the external application's standard input / output / error.
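A hedged sketch of how to avoid leaking those pipe descriptors when launching an external program (the command is a placeholder; this uses ProcessBuilder rather than Runtime.exec, but the descriptors involved are the same):
ProcessBuilder pb = new ProcessBuilder("some-command");   // placeholder command
pb.redirectErrorStream(true);              // merge stderr into stdout: one less pipe to drain
Process p = pb.start();
p.getOutputStream().close();               // we send no input, so release the stdin pipe
try (BufferedReader out = new BufferedReader(
        new InputStreamReader(p.getInputStream()))) {
    String line;
    while ((line = out.readLine()) != null) {
        System.out.println(line);          // drain the output so the child cannot block
    }
}                                          // closing the reader releases the stdout pipe
p.waitFor();                               // reap the child process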
For UNIX:
As Stephen C has suggested, changing the maximum file descriptor value to a higher value avoids this problem.
Try looking at your present file descriptor capacity:
$ ulimit -n
Then change the limit according to your requirements.
$ ulimit -n <value>
Note that this just changes the limits in the current shell and any child / descendant process. To make the change "stick" you need to put it into the relevant shell script or initialization file.
You're obviously not closing your file descriptors before opening new ones. Are you on Windows or Linux?
Although in most cases the error quite clearly means that file handles have not been closed, I just encountered an instance with JDK7 on Linux that is convoluted enough to be worth explaining here.
The program opened a FileOutputStream (fos), a BufferedOutputStream (bos) and a DataOutputStream (dos). After writing to the DataOutputStream, the dos was closed and I thought everything had gone fine.
Internally, however, the dos tried to flush the bos, which failed with a disk-full error. That exception was swallowed by the DataOutputStream, and as a consequence the underlying bos was not closed, so the fos was still open.
At a later stage that file was renamed (from something ending in .tmp) to its real name. Thereby the Java file descriptor trackers lost track of the original .tmp file, yet it was still open!
To solve this, I had to flush the DataOutputStream myself, catch the IOException, and close the FileOutputStream myself.
I hope this helps someone.
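A minimal sketch of that defensive close order, under the assumptions above (the method name, target file and payload are made up for illustration):
// Flush explicitly so a disk-full IOException surfaces where we can see it,
// and close the underlying FileOutputStream in a finally of its own so the
// descriptor is released even if the higher-level close() fails.
static void writeRecord(File target) throws IOException {
    FileOutputStream fos = new FileOutputStream(target);
    DataOutputStream dos = new DataOutputStream(new BufferedOutputStream(fos));
    try {
        dos.writeInt(42);   // example payload
        dos.flush();
    } finally {
        try {
            dos.close();
        } finally {
            fos.close();    // always release the file descriptor
        }
    }
}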
If you're seeing this in automated tests: it's best to properly close all files between test runs.
If you're not sure which file(s) you have left open, a good place to start is the "open" calls which are throwing exceptions! 😄
If you have a file handle that should be open exactly as long as its parent object is alive, you could add a finalize method on the parent that calls close on the file handle, and call System.gc() between tests.
Recently I had a program batch-processing files. I definitely closed each file in the loop, but the error was still there.
Later I worked around the problem by garbage-collecting eagerly every hundred files:
int index = 0;
while (hasMoreFiles()) {                 // hypothetical loop condition over the batch
    OutputStream out = openNextFile();   // hypothetical helper that opens the next output file
    try {
        // write to out ...
    } finally {
        out.close();
    }
    if (index++ % 100 == 0)
        System.gc();
}
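For comparison, a try-with-resources loop releases each descriptor as soon as its iteration ends, without relying on the garbage collector; a minimal sketch with made-up file names:
// Sketch: each stream is closed at the end of its own iteration, so descriptors
// never pile up and no System.gc() call is needed.
for (int i = 0; i < fileCount; i++) {
    try (OutputStream out = new FileOutputStream("batch-" + i + ".out")) {
        // write to out ...
    }
}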
I wrote a Java program in Eclipse which writes 30 million lines to a file.
The first time I ran this code, writing to the text file (foo.txt) took approximately 104 seconds.
I deleted the text file (foo.txt), which I had written the lines to, and ran the program again. This time it took 61 seconds.
I continued this process, and the time taken to write to the file kept decreasing each time I ran the program. The recorded times were as follows
(in seconds, approximate values):
104 -> 61 -> 39 -> 25 -> 18 -> 16 -> 16 -> 16 -> ...
What I observed is that the time taken to write to the text file (foo.txt) kept decreasing until it became constant at around 16 seconds.
My Java code is as follows:
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;

public class fileWrite {
    private static int file_sz = 30000000;
    private static final String line = "Help I am a chinese guy trapped in a fortune cookie factory!!";
    /*
     No offense meant to any Chinese person, i apologise in advance if i have hurt your feelings.
    */
    private static void write(List<String> list, Writer writer) throws IOException {
        long start = System.currentTimeMillis();
        for (String list_el : list) {
            writer.write(list_el);
        }
        writer.flush();
        writer.close();
        long end = System.currentTimeMillis();
        System.out.println((end - start) / 1000f + "seconds");
    }

    public static void main(String[] args) {
        try {
            File file = new File("foo.txt");
            if (!file.exists()) {
                file.createNewFile();
            }
            FileWriter writer = new FileWriter(file.getAbsolutePath());
            List<String> records = new ArrayList<String>(file_sz);
            for (int i = 0; i < file_sz; ++i) {
                records.add(line);
            }
            write(records, writer);
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}
The questions I would like to ask are:
Why did the time taken to write to the file become constant?
Is the time decrease related to the cache somehow?
I would be really grateful if someone could shed light on what is happening behind the scenes. Any links which explain the working of the system in detail would be welcome too.
Thank you in advance.
That's probably your operating system, and specifically your file system, doing its job.
File systems represent files as a series of blocks or extents; this way, files don't have to fit contiguously on your storage medium.
The first time you wrote your file, the file system started with the first free block it could find; when that block had been written, it got the next free one, added it to the list of blocks in your file, and so on.
As the file grew, the file system stopped looking for free blocks scattered between other blocks, took a contiguous chunk of free space on your medium, and simply kept appending the next block to your file. That both reduces file system overhead and, in the case of hard drives, reduces latency caused by the write head slowly moving to a new position.
Now, after you delete your original file, the file system's internal pointer to the "first free block" might still be in that area of contiguous free space.
Also, modern operating systems might notice that your program always opens one file in a specific folder and writes a lot of data to it, and hence might optimize the way the file system is used.
The most likely explanation, however, is that both Java and your OS have write caches in RAM, which hold data written to a file before/while it is actually written to disk. These caches are elastic; as you write a lot of data, the operating system takes more free RAM for write caching (e.g. away from read caches). After your program finishes, the write cache is no longer needed, but since that memory isn't used for anything else, the next time you write a large file the operating system can very quickly assign it to the write cache again.
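If you want timings that are less influenced by those write caches, you can force the data to the device before stopping the clock. A minimal sketch, assuming it lives in the question's class (it reuses file_sz and line, plus the usual java.io imports):
// Measure the write including a sync to the storage device. FileDescriptor.sync()
// blocks until the OS has pushed its buffered data to the device, so the timing
// no longer depends on how much of the file merely sits in the write cache.
private static void writeAndSync() throws IOException {
    long start = System.currentTimeMillis();
    try (FileOutputStream fos = new FileOutputStream("foo.txt");
         Writer writer = new OutputStreamWriter(fos)) {
        for (int i = 0; i < file_sz; i++) {
            writer.write(line);
        }
        writer.flush();       // push Java-side buffers down to the OS
        fos.getFD().sync();   // push OS buffers to the device
    }
    long end = System.currentTimeMillis();
    System.out.println((end - start) / 1000f + " seconds (including sync)");
}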
Every 5 seconds (for example), a server checks whether files have been added to a specific directory. If so, it reads and processes them. The files concerned can be quite big (100+ MB, for example), so copying/uploading them to that directory can take quite a long time.
What if the server tries to access a file that hasn't finished being copied/uploaded? How does Java manage these concurrent accesses? Does it depend on the OS of the server?
I made a test, copying a ~1,300,000-line TXT file (i.e. about 200 MB) from a remote server to my local computer: it takes about 5 seconds. During this time, I ran the following Java class:
public static void main(String[] args) throws Exception {
    String local = "C:\\large.txt";
    BufferedReader reader = new BufferedReader(new FileReader(local));
    int lines = 0;
    while (reader.readLine() != null)
        lines++;
    reader.close();
    System.out.println(lines + " lines");
}
I get the following exception:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2882)
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:515)
at java.lang.StringBuffer.append(StringBuffer.java:306)
at java.io.BufferedReader.readLine(BufferedReader.java:345)
at java.io.BufferedReader.readLine(BufferedReader.java:362)
at main.Main.main(Main.java:15)
When I run the class after the file has finished being copied, I get the expected output (i.e. 1229761 lines), so the exception isn't due to the size of the file (as one might think at first). What is Java doing in the background that throws this OutOfMemoryError?
How does JAVA manage these concurrent accesses? Does it depend on the OS of the server?
It depends on the specific OS. If you run the copy and the server in a single JVM, the AsynchronousFileChannel class (new in 1.7) could be of great help. However, if the client and the server are separate JVMs (or, even more so, are started on different machines), it all turns out to be platform specific.
From JavaDoc for AsynchronousFileChannel:
As with FileChannel, the view of a file provided by an instance of this class is guaranteed to be consistent with other views of the same file provided by other instances in the same program. The view provided by an instance of this class may or may not, however, be consistent with the views seen by other concurrently-running programs due to caching performed by the underlying operating system and delays induced by network-filesystem protocols. This is true regardless of the language in which these other programs are written, and whether they are running on the same machine or on some other machine. The exact nature of any such inconsistencies are system-dependent and are therefore unspecified.
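For reference, a minimal sketch of opening a file with AsynchronousFileChannel; the path is the one from the question, and this only illustrates the shape of the API rather than solving the partially-copied-file problem:
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.Future;

public class AsyncReadSketch {
    public static void main(String[] args) throws Exception {
        // Open the file for asynchronous reading; try-with-resources closes the channel.
        try (AsynchronousFileChannel ch = AsynchronousFileChannel.open(
                Paths.get("C:\\large.txt"), StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(8192);
            Future<Integer> pending = ch.read(buf, 0);   // read the first 8 KB, starting at offset 0
            System.out.println("bytes read: " + pending.get());
        }
    }
}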
Why are you using a buffered reader just to count the lines?
From the javadoc:
Reads text from a character-input stream, buffering characters so as to provide for the efficient reading of characters, arrays, and lines.
This means it will "buffer", i.e. save, that entire file in memory, which causes your stack dump. Try a FileReader.