SOLVED:
This is what was wrong:
current.addFolder(folder); (in the final else clause of the if statement)
Added a new folder, but did not guarantee that the folder passed in is the folder actually added; it may simply do nothing if the folder already exists. To overcome this I changed addFolder to return the actual folder (for example, the pre-existing one if there was one) and assigned folder to that return value. That did the trick, so now I've got:
folder = current.addFolder(folder);
current = folder;
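For reference, a sketch of the revised addFolder implied by that fix (it returns whichever ArchiveFolder actually ends up in the list):
public ArchiveFolder addFolder(ArchiveFolder folder) {
    int loc = folders.indexOf(folder); // folders is the ArrayList of ArchiveFolders
    if (loc == -1) {
        folders.add(folder); // new folder: store it and hand it back
        return folder;
    }
    ArchiveFolder real = folders.get(loc); // the existing folder wins
    if (real.getTime() == null) { // fill in details we only just learned
        real.setTime(folder.getTime());
        real.setDate(folder.getDate());
    }
    return real;
}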
Thanks a lot people, your help was much appreciated :)
This is going to be a very long post; hopefully you can understand what I'm talking about, and I appreciate any help. Thanks.
Basically, I've created a personal, non-commercial project (which I don't plan to release) that can read ZIP and RAR files. It can only read the contents of the archive: the folders inside, the files inside the folders and their properties (such as last modified date, last modified time, CRC checksum, uncompressed size, compressed size and file name). It can't extract files either, so it's really a ZIP/RAR viewer, if you will.
Anyway that's slightly irrelevant to my problem but I thought I'd give you some background info.
Now for my problem:
I can successfully list all the folders and files inside a ZIP archive, so now I want to take that raw input and link it together in some useful way. I made two classes: ArchiveFile (represents a file inside a ZIP) and ArchiveFolder (represents a folder inside a ZIP). They both have some useful methods such as getLastModifiedDate, getName, getPath and so on. The difference is that an ArchiveFolder also holds an ArrayList of ArchiveFiles and an ArrayList of further ArchiveFolders (think of these as the files and folders inside a folder).
Now I want to populate my raw input into one root ArchiveFolder, which will hold all the files in the root dir of the ZIP in its ArchiveFile ArrayList and all the folders in the root dir of the ZIP in its ArchiveFolder ArrayList, and this process continues like a chain reaction: more files/folders inside each of those ArchiveFolders, and so on.
So I came up with the following code:
while (archive.hasMore()) {
    String path = "";
    ArchiveFolder current = root;
    String[] contents = archive.getName().split("/");
    for (int x = 0; x < contents.length; ++x) {
        if (x == (contents.length - 1) && !archive.getName().endsWith("/")) { // If on last item and item is a file
            path += contents[x]; // Update final path
            ArchiveFile file = new ArchiveFile(path, contents[x], archive.getUncompressedSize(), archive.getCompressedSize(), archive.getModifiedTime(), archive.getModifiedDate(), archive.getCRC());
            current.addFile(file); // Create and add the file to the current ArchiveFolder
        }
        else if (x == (contents.length - 1)) { // Else if we are on the last item and it is a folder
            path += contents[x] + "/"; // Update final path
            ArchiveFolder folder = new ArchiveFolder(path, contents[x], archive.getModifiedTime(), archive.getModifiedDate());
            current.addFolder(folder); // Create and add this folder to the current ArchiveFolder
        }
        else { // Else we are still traversing through the path
            path += contents[x] + "/"; // Update path
            ArchiveFolder folder = new ArchiveFolder(path, contents[x]);
            current.addFolder(folder); // Create and add the folder (we do not know the modified date/time yet, as all we know is the path, so we can deduce the name only)
            current = folder; // Update current ArchiveFolder to the newly created one for the next iteration of the for loop
        }
    }
    archive.getNext();
}
Assume that root is the root ArchiveFolder (initially empty).
And that archive.getName() returns the name of the current file OR folder in the following fashion: file.txt or folder1/file2.txt or folder4/folder2/ (this is an empty folder) etc. So basically the relative path from the root of the ZIP archive.
Please read through the comments in the above code to familiarize yourself with it. Also assume that the addFolder method in an ArchiveFolder only adds the folder if it doesn't already exist (so there are no duplicate folders), and that it updates the time and date of an existing folder if they are blank (i.e. it was an intermediate folder we only knew the name of, but now we know its details). The code for addFolder is (pretty self-explanatory):
public void addFolder(ArchiveFolder folder) {
    int loc = folders.indexOf(folder); // folders is the ArrayList containing ArchiveFolders
    if (loc == -1) {
        folders.add(folder);
    }
    else {
        ArchiveFolder real = folders.get(loc);
        if (real.getTime() == null) {
            real.setTime(folder.getTime());
            real.setDate(folder.getDate());
        }
    }
}
So I can't see anything wrong with the code. It works, and after finishing, the root ArchiveFolder contains all the files in the root of the ZIP, as I want it to, and all the directories in the root folder, as I want it to. So you'd think it works as expected. But no: the ArchiveFolders in the root folder don't contain the data inside those 'child' folders; each is just a blank folder with no further files and folders (while it really does contain more files/folders when viewed in WinZip).
After debugging using Eclipse, the for loop does iterate through all the files (even those not included above), so this led me to believe that there is a problem with this line of the code:
current = folder;
What this line does is update the current folder (used as an intermediate by the loop) to the newly added folder.
I thought Java passed by reference, and thus that all operations and additions on future ArchiveFiles and ArchiveFolders would automatically be reflected, with parent ArchiveFolders updated accordingly. But that does not appear to be the case?
I know this is a long ass post and I really hope anyone can help me out with this.
Thanks in advance.
Since you use Eclipse, set a breakpoint and step through the method. It may take time, but it helps with finding bugs (check the object IDs, for example, to see whether a reference has changed).
Java does not actually pass references the way you'd understand it in C++, for example. It passes by value, but all variables of non-primitive types are actually pointers to objects. So whenever you pass a variable to a method, you are handing over a copy of the pointer, meaning both variables point to the same object (change the object through one, and the other will "see" the change). But assigning a different value to the pointer on the caller's or callee's side will not change the other side's pointer.
Hope I'm clear?
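A minimal demo of the difference, using a hypothetical Holder class:
class PassByValueDemo {
    static class Holder { int value; }

    static void mutate(Holder h)   { h.value = 42; }      // caller sees this
    static void reassign(Holder h) { h = new Holder(); }  // caller does not

    public static void main(String[] args) {
        Holder h = new Holder();
        mutate(h);
        System.out.println(h.value); // prints 42: the shared object was changed
        reassign(h);
        System.out.println(h.value); // still 42: h was never re-pointed
    }
}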
I suspect you haven't overridden equals() and hashCode() correctly on your ArchiveFolder class, and thus
folders.indexOf(folder)
in addFolder() is always returning -1.
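If so, a minimal sketch of the two overrides, assuming two ArchiveFolders count as the same folder when their paths inside the archive match (getPath() is mentioned in the question):
@Override
public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof ArchiveFolder)) return false;
    return getPath().equals(((ArchiveFolder) o).getPath());
}

@Override
public int hashCode() {
    return getPath().hashCode(); // must stay consistent with equals()
}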
Related
I have been searching for a way to get a File object for a file in the resources folder. I have read a lot of similar questions on this website, but none fix my problem exactly.
Link already referred to:
how-to-get-a-path-to-a-resource-in-a-java-jar-file
It got really close to answering my question:
String path = this.getClass().getClassLoader().getResource(<resourceFileName>)
        .toExternalForm();
I am trying to have a resource file that I can write data into and then pass that File object to another part of my program. I know I could technically create a temp file, write data into it, and then pass that to the other part of my program. The problem with that approach is that I think it can take a lot of system resources; my program will need to create a lot of these temp files.
Is there any way I can reuse one file in the resources folder? All I need is to get its path (and it needs to work in a jar). I have tried this snippet of code I created for testing; I don't really know why it returns false, because in the IDE it returns true.
public File getFile(String fileName) throws FileNotFoundException {
    // Getting file from the resources folder
    ClassLoader classLoader = getClass().getClassLoader();
    URL fileUrl = classLoader.getResource(fileName);
    if (fileUrl == null)
        throw new FileNotFoundException("Cannot find file " + fileName);
    System.out.println("before: " + fileUrl.toExternalForm());
    final String result = fileUrl.toExternalForm()
            .replace("jar:", "")
            .replace("file:", "");
    System.out.println("after: " + result);
    return new File(result);
}
Output:
before: jar:file:/C:/Users/%myuser%/Downloads/Untitlecd.jar!/Recording.wav
after: /C:/Users/%myuser%/Downloads/Untitlecd.jar!/Recording.wav
false
i have been searching for a way to get a file object from a file in the resources folder.
This is flat out impossible. The resources folder is going to end up jarred into your distribution, and you can't edit jar files; they are read-only. Or at least, you should consider them so: non-idiotic deployments will generally mark their own code files (which includes those jars) as read-only to the running process. Even if not, editing jar files is extremely heavy and not something you want to do, and on Windows open files can't be edited or replaced like this without significant headaches.
The 'resources' folder simply isn't designed for files that are meant to be modified.
The usual strategy is to make a directory someplace (for example, under the user's home dir, accessed via System.getProperty("user.home")) and then make/edit files within that dir. If you wish, you can put templates in your resources folder and use those to 'initialize' that dir hanging off the user's home dir with a skeleton version.
If you have tens of thousands of files to make, whatever process needs this should be adjusted to not need it, for example by using a database (H2, perhaps, if you want to ship it with your Java app and keep its footprint as low as possible).
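A minimal sketch of that template strategy; the method and the .myapp directory name are made up for the example, and Recording.wav is the resource from the question:
public File getWritableCopy(String resourceName) throws IOException {
    // Copy the read-only resource into a writable per-user directory on first use.
    Path appDir = Paths.get(System.getProperty("user.home"), ".myapp");
    Files.createDirectories(appDir);
    Path target = appDir.resolve(resourceName);
    if (!Files.exists(target)) {
        try (InputStream in = getClass().getClassLoader().getResourceAsStream(resourceName)) {
            if (in == null)
                throw new FileNotFoundException("Cannot find resource " + resourceName);
            Files.copy(in, target);
        }
    }
    return target.toFile(); // a real File you can read and write
}
Calling getWritableCopy("Recording.wav") behaves the same whether the program runs from the IDE or from a jar, because the resource is only ever read through getResourceAsStream and all writes go to the copy.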
In one requirement, I need to copy multiple files from one location to another, network location.
Let's assume that I have the following files present in the /src location:
a.pdf, b.pdf, a.doc, b.doc, a.txt and b.txt
I need to copy a.pdf, a.doc and a.txt atomically into the /dest location, at once.
Currently I am using the java.nio.file.Files class, with code as follows:
Path srcFile1 = Paths.get("/src/a.pdf");
Path destFile1 = Paths.get("/dest/a.pdf");
Path srcFile2 = Paths.get("/src/a.doc");
Path destFile2 = Paths.get("/dest/a.doc");
Path srcFile3 = Paths.get("/src/a.txt");
Path destFile3 = Paths.get("/dest/a.txt");
Files.copy(srcFile1, destFile1);
Files.copy(srcFile2, destFile2);
Files.copy(srcFile3, destFile3);
but with this process the files are copied one after another.
As an alternative, in order to make the whole process atomic,
I am thinking of zipping all the files, moving the ZIP to /dest and unzipping it at the destination.
Is this approach correct for making the whole copy process atomic? Has anyone dealt with a similar problem and resolved it?
Is this approach correct for making the whole copy process atomic? Has anyone dealt with a similar problem and resolved it?
You can copy the files to a new temporary directory and then rename the directory.
Before renaming your temporary directory, you need to delete the destination directory.
If other files are already in the destination directory that you don't want to overwrite, you can move all files from the temporary directory to the destination directory.
This is not completely atomic, however.
With removing /dest:
String tmpPath = "/tmp/in/same/partition/as/dest";
File tmp = new File(tmpPath);
tmp.mkdirs();
Path srcFile1 = Paths.get("/src/a.pdf");
Path destFile1 = Paths.get(tmpPath + "/a.pdf");
Path srcFile2 = Paths.get("/src/a.doc");
Path destFile2 = Paths.get(tmpPath + "/a.doc");
Path srcFile3 = Paths.get("/src/a.txt");
Path destFile3 = Paths.get(tmpPath + "/a.txt");
Files.copy(srcFile1, destFile1);
Files.copy(srcFile2, destFile2);
Files.copy(srcFile3, destFile3);
delete(new File("/dest"));
tmp.renameTo(new File("/dest")); // renameTo takes a File and only works within one filesystem

void delete(File f) throws IOException {
    if (f.isDirectory()) {
        for (File c : f.listFiles())
            delete(c);
    }
    if (!f.delete())
        throw new FileNotFoundException("Failed to delete file: " + f);
}
With just overwriting the files:
String tmpPath = "/tmp/in/same/partition/as/dest";
File tmp = new File(tmpPath);
tmp.mkdirs();
Path srcFile1 = Paths.get("/src/a.pdf");
Path destFile1 = Paths.get("/dest/a.pdf");
Path tmp1 = Paths.get(tmpPath + "/a.pdf");
Path srcFile2 = Paths.get("/src/a.doc");
Path destFile2 = Paths.get("/dest/a.doc");
Path tmp2 = Paths.get(tmpPath + "/a.doc");
Path srcFile3 = Paths.get("/src/a.txt");
Path destFile3 = Paths.get("/dest/a.txt");
Path tmp3 = Paths.get(tmpPath + "/a.txt");
Files.copy(srcFile1, tmp1);
Files.copy(srcFile2, tmp2);
Files.copy(srcFile3, tmp3);
// Start of non-atomic section (it can be redone if necessary)
Files.deleteIfExists(destFile1);
Files.deleteIfExists(destFile2);
Files.deleteIfExists(destFile3);
Files.move(tmp1, destFile1);
Files.move(tmp2, destFile2);
Files.move(tmp3, destFile3);
// End of non-atomic section
Even though the second method contains a non-atomic section, the copy itself goes through a temporary directory, so the destination files are never half-overwritten.
If the process aborts while moving the files, the remaining moves can simply be completed or redone.
See https://stackoverflow.com/a/4645271/10871900 as a reference for moving files and https://stackoverflow.com/a/779529/10871900 for recursively deleting directories.
First, there are several possibilities for copying a file or a directory. Baeldung gives a very nice overview of the different options. Additionally, you can also use FileCopyUtils from Spring. Unfortunately, none of these methods are atomic.
I found an older post and adapted it a little. You can try using Spring's low-level transaction management support: make a transaction out of the method and define what should happen on rollback. There is also a nice article from Baeldung about it.
@Autowired
private PlatformTransactionManager transactionManager;

@Transactional(rollbackOn = IOException.class)
public void copy(List<File> files) throws IOException {
    TransactionDefinition transactionDefinition = new DefaultTransactionDefinition();
    TransactionStatus transactionStatus = transactionManager.getTransaction(transactionDefinition);
    TransactionSynchronizationManager.registerSynchronization(new TransactionSynchronization() {
        @Override
        public void afterCompletion(int status) {
            if (status == STATUS_ROLLED_BACK) {
                // try to delete the created files
            }
        }
    });
    try {
        // copy the files
        transactionManager.commit(transactionStatus);
    } catch (IOException e) {
        // roll back and let the synchronization above clean up
        transactionManager.rollback(transactionStatus);
        throw e;
    }
}
Or you can use a simple try-catch block: if an exception is thrown, you delete the files created so far.
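A plain-Java sketch of that try-catch idea, reusing the srcFile variables from the question (no Spring involved):
List<Path> copied = new ArrayList<>();
try {
    for (Path src : Arrays.asList(srcFile1, srcFile2, srcFile3)) {
        Path dest = Paths.get("/dest").resolve(src.getFileName());
        Files.copy(src, dest);
        copied.add(dest); // remember what we created
    }
} catch (IOException e) {
    for (Path p : copied)
        Files.deleteIfExists(p); // undo the partial copy
    throw e;
}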
Your question doesn't state the goal of the atomicity: atomic with respect to what? Even unzipping is never atomic; the VM might crash with an OutOfMemoryError right in the middle of inflating the blocks of the second file. Then there's one file complete, a second partial and a third entirely missing.
The only thing I can think of is a two-phase commit, like all the suggestions with a temporary destination that suddenly becomes the real target. That way the second operation either never happens or produces the final state.
Another approach would be to write a sort of cheap checksum file into the target afterwards. That would make it easy for an external process to watch for the creation of such files and verify the content of the files it finds against them.
The latter is much the same as offering the container/ZIP/archive right away instead of piling files into a directory; most archive formats have or support integrity checks.
(Operating systems and filesystems also differ in behaviour when directories disappear while being written to. Some accept it and write all the data to a recoverable buffer; others still accept writes but don't change anything; others fail immediately upon the first write, since the target block on the device is unknown.)
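For the checksum idea, a small sketch using SHA-256; the .sha256 suffix is just a convention chosen for the example:
static void writeChecksum(Path file) throws IOException, NoSuchAlgorithmException {
    // Reads the whole file into memory: fine for a sketch, not for huge files.
    MessageDigest md = MessageDigest.getInstance("SHA-256");
    byte[] hash = md.digest(Files.readAllBytes(file));
    StringBuilder hex = new StringBuilder();
    for (byte b : hash)
        hex.append(String.format("%02x", b));
    // An external process can recompute the hash and compare.
    Files.writeString(Paths.get(file + ".sha256"), hex.toString());
}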
FOR ATOMIC WRITE:
Standard filesystems have no transaction concept, so to be atomic you have to perform only a single action.
Therefore, to write several files atomically, create a folder with, say, a timestamp in its name and copy the files into that folder.
Then you can either rename it to the final destination or create a symbolic link to it.
You can use anything similar to this, like file-based volumes on Linux, etc.
Remember that deleting an existing symbolic link and creating a new one is never atomic as a pair, so you would need to handle that situation in your code and switch to the renamed/linked folder once it's available, rather than relying on the remove/create moment. Under normal circumstances, though, removing and creating a new link is a really fast operation.
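In Java, the rename step can be expressed with Files.move and ATOMIC_MOVE. A sketch, assuming the staging directory is on the same filesystem as the destination, /dest does not yet exist, and the platform supports atomic directory renames (otherwise AtomicMoveNotSupportedException is thrown); the /data parent directory is made up for the example:
// Stage everything in a temporary directory next to the destination...
Path staging = Files.createTempDirectory(Paths.get("/data"), "staging-");
Files.copy(Paths.get("/src/a.pdf"), staging.resolve("a.pdf"));
Files.copy(Paths.get("/src/a.doc"), staging.resolve("a.doc"));
Files.copy(Paths.get("/src/a.txt"), staging.resolve("a.txt"));
// ...then publish it with a single rename.
Files.move(staging, Paths.get("/dest"), StandardCopyOption.ATOMIC_MOVE);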
FOR ATOMIC READ:
Well, the problem is not in the code, but at the operating system/filesystem level.
Some time ago, I got into a very similar situation. A database engine was running and changing several files "at once". I needed to copy the current state, but the second file had already changed before the first one was copied.
There are two options:
Use a filesystem with snapshot support. At some moment you create a snapshot and then copy the files from it.
On Linux, you can lock the filesystem with fsfreeze --freeze and unlock it later with fsfreeze --unfreeze. While the filesystem is frozen, you can read the files as usual, but no process can change them.
Neither option worked for me, as I couldn't change the filesystem type and locking the filesystem wasn't possible (it was the root filesystem).
So I created an empty file, mounted it as a loop device, and formatted it. From that moment on, I could fsfreeze just my virtual volume without touching the root filesystem.
My script first called fsfreeze --freeze /my/volume, then performed the copy action, and then called fsfreeze --unfreeze /my/volume. For the duration of the copy, the files couldn't be changed, and so the copied files were all from exactly the same moment in time; for my purpose, it was like an atomic operation.
By the way, be sure not to fsfreeze your root filesystem :-). I did, and a restart was the only solution.
DATABASE-LIKE APPROACH:
Even databases cannot rely on multi-file atomic operations, so they first write each change to the WAL (write-ahead log) and flush it to storage. Only once it's flushed do they apply the change to the data file.
If there is a problem or crash, the database engine first loads the data file, checks whether there are unapplied transactions in the WAL, and applies them if so.
This is also called journaling, and it's used by some filesystems (ext3, ext4).
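The same write-ahead spirit can be borrowed for publishing a single file: write the new content to a side file, force it to storage, then rename it over the old one. A sketch; data.bin and newContent are placeholders, and the exact durability guarantees still depend on the OS and filesystem:
Path target = Paths.get("/dest/data.bin");
Path side = Paths.get("/dest/data.bin.tmp");
try (FileChannel ch = FileChannel.open(side,
        StandardOpenOption.CREATE, StandardOpenOption.WRITE,
        StandardOpenOption.TRUNCATE_EXISTING)) {
    ch.write(ByteBuffer.wrap(newContent)); // newContent: the bytes to publish
    ch.force(true); // flush file data and metadata to the device
}
// On POSIX, the rename atomically replaces the old file with the new one.
Files.move(side, target, StandardCopyOption.ATOMIC_MOVE);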
I hope this solution is useful. As I understand it, you need to copy files from one directory to another directory, so my solution is as follows. Thank you!
public class CopyFilesDirectoryProgram {
    public static void main(String[] args) throws IOException {
        String sourceDirectoryName = "//mention your source path";
        String targetDirectoryName = "//mention your destination path";
        File sdir = new File(sourceDirectoryName);
        File tdir = new File(targetDirectoryName);
        // Call the method for execution
        copy(sdir, tdir);
    }

    private static void copy(File sdir, File tdir) throws IOException {
        if (sdir.isDirectory()) {
            copyFilesFromDirectory(sdir, tdir);
        } else {
            Files.copy(sdir.toPath(), tdir.toPath());
        }
    }

    private static void copyFilesFromDirectory(File source, File target) throws IOException {
        if (!target.exists()) {
            target.mkdir();
        }
        for (String item : source.list()) {
            copy(new File(source, item), new File(target, item));
        }
    }
}
I want to be able to iterate through a package of files as if the package were a folder.
Something like the below (scripts being the java package):
File scriptFolder = new File("scripts").getAbsoluteFile();
The packages are not being treated like folders. If I hardcode the path C:\Users\...\project_folder\...\scripts, the File.isFile() method returns false for the package. If I do new File("C:\\Users\\...\\project_folder\\...\\scripts\\script").isFile() I get true.
I want to get a File of the folder so I can get a list of the files in the folder and iterate through it.
The .isFile() method returns true only if you are referencing a plain jane normal file. If you're referencing a directory, it'd return false. Try .isDirectory() or possibly .exists().
Or don't; there's no real need:
File[] filesInDir = new File("C:\\Users\\....\\scripts").listFiles();
if (filesInDir == null) {
    // this means it wasn't a directory or didn't exist or isn't readable
} else {
    for (File child : filesInDir) {
        // called for each file in dir
    }
}
The official javadocs say this about File#isFile():
Tests whether the file denoted by this abstract pathname is a normal file. A file is normal if it is not a directory and, in addition, satisfies other system-dependent criteria. Any non-directory file created by a Java application is guaranteed to be a normal file.
You can check if it is a directory with File#isDirectory(), then if it is, you can list its contents with File#listFiles().
Unless I'm missing something in your question, C:\Users\...\project_folder\...\scripts is a directory, so isFile() will return false because it is not a file.
I have two folders, source and target, with files and possible subfolders in them (the directory structure is assumed to be the same; subfolders and files can go to any depth). We want to synchronize the target so that for all files:
Exists in source, but not in target -> new, copy over to target
Exists in target, but not in source -> deleted, delete from the target
Exists in both, but binary unequal -> changed, copy over from source
Exists in both, and is binary equal -> unchanged, leave be
One problem I have with this is checking for the existence of a file (the return value of listFiles() doesn't seem to have contains() defined), but a far bigger obstacle is referencing the other directory structure. For example, how would I check whether the target folder contains the file "foo.txt" while iterating through the source folder and finding it there? Here's what I have so far:
public void synchronize(File source, File target) {
    // first loop; accounts for every case except deleted
    if (source.isDirectory()) {
        for (File i : source.listFiles()) {
            if (i.isDirectory()) {
                synchronize(i, /** i's equivalent subdirectory in target */);
            }
            else if (/** i is new */) {
                /** Copy i over to the appropriate target folder */
            }
            else if (/** i is different */) {
                /** copy i over from source to target */
            }
            else { /** i is identical in both */
                /** leave i in target alone */
            }
        }
        for (File i : target.listFiles()) {
            if (/** i exists in the target but not in source */) {
                /** delete in target */
            }
        }
    }
}
EDIT (important): I thank you guys for all the answers, but the main problem remains unsolved: referring to the other directory, i.e. the stuff in the comments. h22's answer seems to be somewhere in the ballpark, but it's not sufficient, as explained in the comment below it. I'd be very grateful if someone could explain this in even smaller words. From experience, this is exactly the kind of problem that someone more Java-savvy could solve in five minutes, whereas I would spend two frustrating weeks rediscovering America.
As wero points out, you can use aFile.exists() to see if a given path exists. You should also combine it with aFile.isFile() to check whether the path is a normal file (and not, say, a folder).
Checking content-equals is more tricky. I propose the following:
boolean sameContents(File fa, File fb) throws IOException {
    Path a = fa.toPath();
    Path b = fb.toPath();
    if (Files.size(a) != Files.size(b)) return false;
    return Arrays.equals(Files.readAllBytes(a), Files.readAllBytes(b));
}
But only do this if the files are expected to be small; otherwise you could run out of memory trying to compare them in one go (Arrays.equals requires both byte arrays to be fully in memory). If you have large files in there, this answer proposes Apache Commons IO's FileUtils.contentEquals().
Note that both the above code and contentEquals only compare files, and not folders. To compare folders, you will need to use recursion, calling sameContents or equivalent on each same-named, same-sized file, and erroring out if no match is found for a particular pathname either in source or in destination.
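A sketch of that recursion, building on the sameContents helper above; it returns false on the first mismatch rather than erroring out, so adapt it as needed:
boolean sameFolders(File da, File db) throws IOException {
    String[] namesA = da.list(), namesB = db.list();
    if (namesA == null || namesB == null) return false; // not both directories
    Arrays.sort(namesA);
    Arrays.sort(namesB);
    if (!Arrays.equals(namesA, namesB)) return false;   // different entries
    for (String name : namesA) {
        File fa = new File(da, name), fb = new File(db, name);
        if (fa.isDirectory() != fb.isDirectory()) return false;
        if (fa.isDirectory() ? !sameFolders(fa, fb) : !sameContents(fa, fb))
            return false;
    }
    return true;
}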
Only visit the source folder recursively. Strip the source root and address the target location directly:
String subPath = sourceFile.getAbsolutePath().substring(sourceRoot.length());
File targetFile = new File(targetRoot + File.separator + subPath);
if (!targetFile.getParentFile().exists()) {
    targetFile.getParentFile().mkdirs();
}
// copy, etc
Otherwise you may have difficulties if the target location is missing parts of the required folder hierarchy, which may go many directories deep.
If you have a target directory File targetDir and a source file File sourceFile in a source directory you can check the existence of the corresponding target file by writing:
File targetFile = new File(targetDir, sourceFile.getName());
boolean exists = targetFile.exists();
I'm using a shell script to automatically create a zipped backup of various directories every hour. If I haven't been working on any of them for quite some time, this creates a lot of duplicate archives. MD5 hashes of the files don't match, because the archives have different filenames, creation dates, etc.
Other than making sure there won't be duplicates in the first place, another option is checking whether the file sizes match, but equal sizes don't necessarily mean the files are duplicates.
Filenames are done like so;
Qt_2012-03-15_23_00.tgz
Qt_2012-03-16_00_00.tgz
So maybe it would be an option to check whether consecutive files have identical file sizes (if "consecutive" is the right word for it).
Pseudo code:
int previoussize = 0;
String previouspath = null;
String Filename = null;
String workDir = "/path/to/workDir";
String processedDir = "/path/to/processedDir";

// Loop over all files
for file in workDir
{
    // Match
    if (file.size() == previoussize)
    {
        if (previouspath != null) // skip first loop
        {
            rm previouspath; // Delete file
        }
    }
    else // No match
    {
        /* If there's no match, we can move the previous file
           to another directory so it doesn't get checked again */
        if (previouspath != null) // skip first loop
        {
            mv previouspath processedDir/Filename;
        }
    }
    previoussize = file.size();
    previouspath = file.path();
    Filename = file.name();
}
Example:
Qt_2012-03-15_23_00.tgz 10KB
Qt_2012-03-16_00_00.tgz 10KB
Qt_2012-03-16_01_00.tgz 10KB
Qt_2012-03-16_02_00.tgz 15KB
Qt_2012-03-16_03_00.tgz 10KB
Qt_2012-03-16_04_00.tgz 10KB
If I'm correct, this would only delete the first two and the second-to-last one. The third and the fourth should be moved to the processedDir.
So I guess I have 2 questions:
Would my pseudo code work the way I intend it to? (I find these things rather confusing.)
Is there a better/simpler/faster way? Even though the chance of accidentally deleting non-identical files this way is very small, it's still a chance.
I can think of a couple of alternatives:
Deploy a version control system such as Git or Subversion, and write a script that periodically checks in any changes. This will save a lot of space, because only files that have actually changed get saved, and changes to text files are stored as diffs.
Use an incremental backup tool. This article lists a number of alternatives.
Normal practice is to put the version control system / backups on a different machine, but you don't have to do that.
It's not clear whether this needs to run as a batch job. If it's manual, you can run Beyond Compare or any decent comparison tool to diff the two archives.