As far as I know, when you delete a file only the pointer to it is removed; the file's data is still on disk, ready to be overwritten. On Linux there is shred, which overwrites the file three times by default.
My question is: how do I write code that does the same thing as shred? I don't know any keywords to use in my search, and I have never found the source code of shred or anything like it.
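For the record, shred's source is part of GNU coreutils (src/shred.c). The core idea can be sketched in Java as overwrite-in-place, then delete; the class name below is my own, and note that on journaling or copy-on-write filesystems (and SSDs) this gives no hard guarantee that the old blocks are actually gone:

```java
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.SecureRandom;

/**
 * Minimal shred-like secure delete: overwrite the file's bytes in place
 * several times with random data, sync each pass, then delete the file.
 */
public class SecureDelete {
    public static void shred(Path file, int passes) throws Exception {
        SecureRandom rnd = new SecureRandom();
        // "rws" opens for read/write and syncs content + metadata on write
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rws")) {
            long length = raf.length();
            byte[] buffer = new byte[8192];
            for (int pass = 0; pass < passes; pass++) {
                raf.seek(0);
                long remaining = length;
                while (remaining > 0) {
                    int n = (int) Math.min(buffer.length, remaining);
                    rnd.nextBytes(buffer);
                    raf.write(buffer, 0, n);   // overwrite in place
                    remaining -= n;
                }
                raf.getFD().sync();            // force this pass to disk
            }
        }
        Files.delete(file);                    // finally remove the directory entry
    }
}
```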
I am currently implementing my own version of the WatchService API in Java.
(You can refer to this example to understand what a file WatchService does.)
I have implemented code for the following cases:
Whenever a file has been created inside the directory or its sub-directory.
Whenever a file has been deleted from the directory or its sub-directory.
Whenever a file has been modified inside the directory or its sub-directory (based on comparing the files' last-modified dates).
I am facing a problem: whenever a file is renamed, I don't understand how to track it.
This link contains the gist of my code at a high level: click here
Whenever a file is renamed, I get two events:
FILE DELETED : {old filename}
FILE ADDED : {new filename}
But what I actually want the result to be is:
FILE RENAMED: FROM {old filename} TO {new filename}
How do I tackle this challenge?
I have provided pseudocode (which I think is enough to understand the problem at hand); if required I can provide the whole code too :)
You can store a hash for each file.
When you detect that some files were deleted and some were added, compare the hashes; if they match, you have found a renamed/moved file.
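A minimal sketch of that idea (class and method names here are illustrative, not part of WatchService; in practice you might compare file sizes first to avoid hashing everything):

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

/**
 * Rename detection by content hash: keep a map from each watched file's
 * path to a digest of its contents. When a delete event and a create
 * event arrive, equal digests suggest a rename/move.
 */
public class RenameDetector {
    private final Map<Path, String> hashes = new HashMap<>();

    public static String hashOf(Path file) throws Exception {
        byte[] digest = MessageDigest.getInstance("SHA-256")
                                     .digest(Files.readAllBytes(file));
        return Base64.getEncoder().encodeToString(digest);
    }

    /** Record the hash of a file while it still exists. */
    public void track(Path file) throws Exception {
        hashes.put(file, hashOf(file));
    }

    /** Called with the path from a delete event and one from a create event. */
    public boolean looksLikeRename(Path deleted, Path created) throws Exception {
        String old = hashes.remove(deleted);
        return old != null && old.equals(hashOf(created));
    }
}
```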
Whenever a file is renamed, the watcher service generates delete + create events. You should somehow merge these two events into a rename event.
You can create a unique id and assign it to the file. Refer to this link. Whenever you see a create event on a file that already has an id, it means the file was renamed; otherwise it is a new creation. For newly created files, you should set a random unique id to keep track of rename events.
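If the filesystem supports user-defined (extended) attributes, the id can be stored on the file itself, so it survives a rename. A sketch; support varies by filesystem (e.g. ext4 yes, some others no), and the attribute name "watcher.id" is my own choice:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.UserDefinedFileAttributeView;
import java.util.UUID;

/** Tags files with a unique id stored in a user-defined file attribute. */
public class FileIdTagger {
    static final String ATTR = "watcher.id";

    /** Check first: not every FileStore supports user-defined attributes. */
    public static boolean isSupported(Path file) throws Exception {
        return Files.getFileStore(file)
                    .supportsFileAttributeView(UserDefinedFileAttributeView.class);
    }

    /** Returns the existing id, or null if the file has none yet (i.e. it is new). */
    public static String readId(Path file) throws Exception {
        UserDefinedFileAttributeView view = Files.getFileAttributeView(
                file, UserDefinedFileAttributeView.class);
        if (!view.list().contains(ATTR)) return null;
        ByteBuffer buf = ByteBuffer.allocate(view.size(ATTR));
        view.read(ATTR, buf);
        buf.flip();
        return StandardCharsets.UTF_8.decode(buf).toString();
    }

    /** Tags a freshly created file so a later create event reveals a rename. */
    public static String tag(Path file) throws Exception {
        String id = UUID.randomUUID().toString();
        UserDefinedFileAttributeView view = Files.getFileAttributeView(
                file, UserDefinedFileAttributeView.class);
        view.write(ATTR, StandardCharsets.UTF_8.encode(id));
        return id;
    }
}
```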
JNotify supports the JNotify.FILE_RENAMED event. As far as I understand, this is a wrapper over the native inotify API, and it can catch file renaming at the system level (changing the name without changing the file descriptor).
I have an app that reads words from csv text files. Since they usually do not change, I have placed them inside a .jar file and read them using a getResourceAsStream call. I really like this approach since I do not have to place a bunch of files on a user's computer - I just have one .jar file.
The problem is that I want to allow an "admin" to add or delete words within the application and then send the new version of the app to other users. This would happen very rarely (99.9% read operations, 0.1% write). However, I found out that it is not possible to write to text files inside the .jar file. Is there any solution appropriate for what I want? If so, please explain it in detail, as I'm still new to Java.
It is not possible because you can't change the contents of a jar that is currently being used by a JVM.
Better to choose an alternative solution, such as keeping the text file in the same folder as your jar file.
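A common pattern for this: look for an editable file next to the jar first, and fall back to the read-only copy bundled inside the jar. A sketch (the resource path "/words.csv" is an assumed example name):

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Prefers an external, editable file; falls back to the bundled resource. */
public class WordSource {
    public static InputStream open(Path external, String resource) throws Exception {
        if (Files.exists(external)) {
            return Files.newInputStream(external);          // admin-edited copy wins
        }
        return WordSource.class.getResourceAsStream(resource); // bundled default
    }
}
```

On first run the app reads the default from the jar; when the admin wants to change the words, they just place (or edit) the external file - no jar rewriting needed.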
I have a spark cluster with 2 machines say mach-1 and mach-2.
I code on my local machine, export it to a JAR, and copy it to mach-1.
Then I run the code on mach-1 using spark-submit.
The code tries to read a local file, which exists on mach-1.
It works well most of the time, but sometimes it gives me errors like "File does not exist". So I copied the file to mach-2 as well, and now the code works.
Similarly, while writing the output to a local file, it sometimes worked when the output folder was only available on mach-1, but then it gave an error, so I created the output folder on mach-2 as well. Now it creates the output on both mach-1 and mach-2 (some parts on mach-1 and some on mach-2).
Is this expected behavior? Any pointers to texts explaining this?
P.S.: I do not collect my RDDs before writing to a local file (I do it in foreach). If I do that, the code works well with the output folder only present on mach-1.
Your input data has to exist on every node. You can achieve this by copying the data to the nodes, or by using NFS or HDFS.
For your output you can write to NFS or HDFS. Or you can call collect(), but only do that when your dataset fits into the driver's memory. When it doesn't fit, you should call rdd.toLocalIterator() or take(n).
Is it possible that you are running your code in cluster mode and not in client mode?
So I'm putting together an RSS parser which will process an RSS feed, filter it, and then download the matched items. Assume that the files being downloaded are legal torrent files.
Now I need to keep a record of the files that I have already downloaded, so they aren't downloaded again.
I've already got it working with SQLite (create database if not exists, insert row if a select statement returns nothing), but the resulting jar file is 2.5MB+ (due to the sqlite libs).
I'm thinking that if I use a text file, I could cut down the jar file to a few hundred kilobytes.
I could keep a list of the names of downloaded files - one per line - read the whole file into memory, search whether a file exists, etc.
The few questions that occur to me now:
Say 10 files are downloaded a day - would the text file method end up taking too many resources?
Overall, which one is faster?
Anyway, what do you guys think? I could use some advice here, as I'm still new to programming and doing this as a hobby thing :)
If you only need to keep track of a few pieces of information (like the name of the file), you can certainly use a simple text file.
Using a BufferedReader to read it, you should achieve good performance.
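A sketch of the text-file approach: load the names into a HashSet once for O(1) "already downloaded?" checks, and append a line per new download. At roughly 10 files a day this file stays tiny for years:

```java
import java.io.BufferedReader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashSet;
import java.util.Set;

/** One-name-per-line download log backed by a plain text file. */
public class DownloadLog {
    /** Read the whole log into a set for fast membership checks. */
    public static Set<String> load(Path log) throws Exception {
        Set<String> seen = new HashSet<>();
        if (!Files.exists(log)) return seen;
        try (BufferedReader r = Files.newBufferedReader(log)) {
            String line;
            while ((line = r.readLine()) != null) {
                if (!line.isBlank()) seen.add(line);
            }
        }
        return seen;
    }

    /** Append a newly downloaded file name to the log. */
    public static void record(Path log, String name) throws Exception {
        Files.writeString(log, name + System.lineSeparator(),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }
}
```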
Theoretically a DB (either relational or NoSQL) is better. But if the distribution size is critical for you, using the file system can be preferable.
The only problem here is the performance of data access (for both writes and reads). Consider the following approach: do not use one single file; use a directory that contains several files instead. The file name contains the key (or keys) that allow you to access specific data, just like a key in a map. This way you can access the data relatively easily and quickly.
Take a look at XStream. They have an implementation of Map that works as described above: it stores entries on disk, each entry in a separate file.
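The directory-as-map idea can also be sketched with the plain JDK, without XStream; keys must be valid file names in this simplified version:

```java
import java.nio.file.Files;
import java.nio.file.Path;

/**
 * Directory-backed map: each key becomes a file name, each value the
 * file's contents, so one entry can be read or written without touching
 * the rest of the data.
 */
public class FileMap {
    private final Path dir;

    public FileMap(Path dir) throws Exception {
        this.dir = dir;
        Files.createDirectories(dir);
    }

    public void put(String key, String value) throws Exception {
        Files.writeString(dir.resolve(key), value);   // one file per entry
    }

    public String get(String key) throws Exception {
        Path f = dir.resolve(key);
        return Files.exists(f) ? Files.readString(f) : null;
    }
}
```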
I am developing a Java desktop application. The app needs a configuration to start. For this, I want to ship a defaultConfig.properties or defaultConfig.xml file with the application, so that if the user doesn't select any configuration, the application starts with the help of the defaultConfig file.
But I am afraid my application will crash if the user accidentally edits the defaultConfig file. So is there any mechanism through which I can check, before the application starts, whether the config file has changed or not?
How do other applications (out in the market) deal with this type of situation, in which the application depends on a configuration file?
If the user edits the config file, accidentally or intentionally, then the application won't run in the future unless they re-install it.
I agree with David that using an MD5 hash is a good and simple way to accomplish what you want.
Basically you would use the MD5 hashing code provided by the JDK (or elsewhere) to generate a hash code from the default data in Config.xml, and save that hash code to a file (or hard-code it into the function that does the checking). Then each time your application starts, load the saved hash code, load Config.xml, generate a hash code from it again, and compare the saved hash code to the one generated from the loaded config file. If they are the same, the data has not changed; if they differ, the data has been modified.
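A sketch of that flow using the JDK's MessageDigest (file names are examples):

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.util.Base64;

/** Saves and verifies an MD5 hash of a config file. */
public class ConfigGuard {
    public static String md5Of(Path file) throws Exception {
        byte[] digest = MessageDigest.getInstance("MD5")
                                     .digest(Files.readAllBytes(file));
        return Base64.getEncoder().encodeToString(digest);
    }

    /** Run once when shipping the app: store the expected hash. */
    public static void saveExpected(Path config, Path hashFile) throws Exception {
        Files.writeString(hashFile, md5Of(config));
    }

    /** Run on every start: re-hash and compare with the saved value. */
    public static boolean isUnmodified(Path config, Path hashFile) throws Exception {
        return Files.readString(hashFile).equals(md5Of(config));
    }
}
```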
However, as others are suggesting, if the file should not be editable by the user, then you should consider storing the configuration in a form the user cannot easily edit. The easiest thing I can think of would be to wrap the OutputStream you use to write the Config.xml file in a GZIPOutputStream. Not only will this make it difficult for the user to edit the configuration file, it will also make the Config.xml file take up less space.
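The GZIP wrapping could look like this (a sketch; file names are examples). The file on disk becomes compressed binary rather than editable text:

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Writes and reads the config through GZIP streams. */
public class GzipConfig {
    public static void write(Path file, String xml) throws Exception {
        try (OutputStream out = new GZIPOutputStream(Files.newOutputStream(file))) {
            out.write(xml.getBytes("UTF-8"));   // stored compressed on disk
        }
    }

    public static String read(Path file) throws Exception {
        try (InputStream in = new GZIPInputStream(Files.newInputStream(file))) {
            return new String(in.readAllBytes(), "UTF-8");
        }
    }
}
```

Note this is obfuscation, not protection: a determined user can still gunzip the file, which is why a hash check or signature on top is worthwhile.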
I am not at all sure that this is a good approach, but if you want to go ahead with it you can compute a hash of the configuration file (say MD5) and recompute and compare it every time the app starts.
Come to think of it, if the user is forbidden to edit a file, why expose it? Stick it in a jar file, for example, far away from the user's eyes.
If the default configuration is not supposed to be edited, perhaps you don't really want to store it in a file in the first place? Could you not store the default values of the configuration in the code directly?
Remove write permissions from the file. This way the user gets a warning before trying to change it.
Add a hash or checksum and verify it before loading the file.
For added security, you can replace the simple hash with a cryptographic signature.
From what I have found online so far, there seem to be different approaches code-wise; none appears to be a 100 percent fix. For example:
The DirectoryWatcher implements AbstractResourceWatcher to monitor a specified directory.
Code found here: twit88.com develop-a-java-file-watcher
One problem encountered was: if I copy a large file from a remote network source to the local directory being monitored, that file will still show up in the directory listing before the network copy has completed. If I try to do almost anything non-trivial to the file at that moment, like moving it to another directory or opening it for writing, an exception will be thrown, because the file is not really completely there yet and the OS still has a write lock on it.
Found on the same site, further below.
How the program works: it accepts a ResourceListener class, which is FileListener. If a change is detected, an onAdd, onChange, or onDelete event will be thrown, passing the file to it.
I will keep searching for more solutions.