Facing issue while chunking then merging the jar files

I have one JAR file, for example apache-cassandra-3.11.6.jar.
First I split it into multiple chunks like these:
apache-cassandra1.jar
apache-cassandra2.jar
apache-cassandra3.jar
apache-cassandra4.jar
apache-cassandra5.jar
apache-cassandra6.jar
Then I reassemble them into a new JAR file, apache-cassandra_Merged.jar.
The problem is that when I compare the original apache-cassandra-3.11.6.jar with the new apache-cassandra_Merged.jar, they do not match.
The newly created apache-cassandra_Merged.jar is also smaller than the original.
Here is my code for reference:
// Chunking/splitting into multiple JARs
Path path = Paths.get("/Original_Jar/apache-cassandra-3.11.6.jar");
byte [] data = Files.readAllBytes(path); // Will read all bytes at once
// Now divide the total bytes into equal parts and write each part to its own small JAR, one by one.
int count = 0;
for(byte[] rangeData : Arrays.copyOfRange(data, rangeSTART, rangeEND)){
FileOutputStream fileOutputStream1 = new FileOutputStream("/Cassandra_Image/Chunked_Jar/apache-cassandra"+count+".jar");
fileOutputStream1.write(rangeData);
}
// Merging back into one JAR
For merging I did the same in reverse: I read each chunked JAR into a byte array and wrote the arrays one by one to FileOutputStream("/Merged_Jar/apache-cassandra_Merged.jar").
Please let me know if I should use a different method or algorithm to split and reassemble the JAR, so that the data is guaranteed to be identical to the original after chunking and merging.
Note: my actual goal is to transfer JARs to a server/directory that only accepts files up to a limited size. For big JARs I need to split them into small pieces, send them one by one, and reassemble them at the target location so that the result is identical to the original JAR.
Thanks in advance.

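This may not be the answer, but I am providing it for your information. Java also provides the Pack200 format, which lets you compress JAR files and uncompress them again later.
The tools are called pack200 and unpack200. Note that they were deprecated in Java 11 and removed in Java 14, so they are only available on older JDKs.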
How to compress
<java_location>...\jre\lib>pack200 -J-Xmx256m small.jar.gz big.jar
How to uncompress
<java_location>...\jre\lib>unpack200 small.jar.gz big.jar
You can refer to the following links.
https://docs.oracle.com/javase/1.5.0/docs/tooldocs/share/pack200.html
https://docs.oracle.com/javase/7/docs/technotes/tools/share/unpack200.html

I was able to solve the issue with shell scripting.
I put the commands below in a shell script and ran it from my Java code.
split -b 1000000 src.jar target.jar
cat target.jaraa target.jarab target.jarac target.jarad target.jarae > merged.jar
Comparing the original and merged files with any checksum algorithm such as SHA-256 then shows they are equal, and the sizes are equal as well.
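For readers who prefer to stay in Java, the split/cat approach above can be sketched roughly as follows. This is only an illustration, not the code from the question: the class name FileChunker, the ".partNNNN" naming scheme and the method names are all assumptions. In the excerpt shown in the question, count is never incremented and the output stream is never closed, which might explain the smaller merged file, though that is a guess since the full code is not shown.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; names and the part-file naming scheme are illustrative only.
final class FileChunker {

    /** Splits source into parts of at most chunkSize bytes inside targetDir; returns the parts in order. */
    static List<Path> split(Path source, Path targetDir, int chunkSize) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<Path> parts = new ArrayList<>();
        byte[] buffer = new byte[chunkSize];
        try (InputStream in = Files.newInputStream(source)) {
            int read;
            // readNBytes (Java 9+) fills the buffer unless the end of the file is reached first
            while ((read = in.readNBytes(buffer, 0, chunkSize)) > 0) {
                Path part = targetDir.resolve(
                        String.format("%s.part%04d", source.getFileName(), parts.size()));
                // try-with-resources closes (and flushes) every part before the next one starts
                try (OutputStream out = Files.newOutputStream(part)) {
                    out.write(buffer, 0, read);
                }
                parts.add(part);
            }
        }
        return parts;
    }

    /** Concatenates the parts, in list order, into the merged file. */
    static void merge(List<Path> parts, Path merged) throws IOException {
        try (OutputStream out = Files.newOutputStream(merged)) {
            for (Path part : parts) {
                Files.copy(part, out);
            }
        }
    }

    /** Computes the SHA-256 digest of a file so that original and merged file can be compared. */
    static byte[] sha256(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        return digest.digest();
    }
}
```

After calling split and then merge, comparing sha256(original) and sha256(merged) with java.util.Arrays.equals plays the same role as the checksum comparison mentioned above. The parts are plain byte ranges, not valid JAR files on their own, so the order of the list passed to merge matters.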

Related

create zip file without writing to disk

I am working on a Spring Boot application that has to return a zip file to the frontend when the user downloads a report. I want to create the zip file without writing the zip file or the original files to disk.
The directory I want to zip contains other directories, which contain the actual files. For example, dir1 has subDir1 and subDir2 inside; subDir1 has two files, subDir1File1.pdf and subDir1File2.pdf. subDir2 also has files inside.
I can do this easily by creating the physical files on the disk. However, I feel it will be more elegant to return these files without writing to disk.

You would use a ByteArrayOutputStream if the goal is to write to memory. In essence, the zip file would be held entirely in memory, so make sure you don't risk having too many requests at once and that the file size is reasonable. Otherwise this approach can seriously backfire!

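You can use the following snippet. Note that it GZIP-compresses a single string rather than building a ZIP archive with entries, and that StringUtils comes from an external utility library (for example Apache Commons Lang):
public static byte[] zip(final String str) throws IOException {
    if (StringUtils.isEmpty(str)) {
        throw new IllegalArgumentException("Cannot zip null or empty string");
    }
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    // closing the GZIP stream finishes the compressed data before the bytes are read
    try (GZIPOutputStream gos = new GZIPOutputStream(bos)) {
        gos.write(str.getBytes(StandardCharsets.UTF_8));
    }
    return bos.toByteArray();
}
But as stated in another answer, make sure you are not putting your program at risk by loading everything into the JVM's memory.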

Please note that you should stream whenever possible. In your case, you could write your data to a ZipOutputStream (https://docs.oracle.com/javase/8/docs/api/index.html?java/util/zip/ZipOutputStream.html) that wraps the response's output stream.
The only downside of this approach is that the client won't be able to show a download progress bar, because the server will not be able to send the "Content-Length" header. That's because the size of a ZIP file is only known after it has been generated, but the server needs to send the headers first. So: no temporary zip file, no file size beforehand.
You are also talking about subdirectories. This is just a naming issue when dealing with a ZIP stream. Each zip item needs to be named like this: "directory/directory2/file.txt". This will produce subdirectories when unzipping.
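A minimal sketch of the naming idea described above, assuming the file contents are already available as byte arrays keyed by their path inside the archive. The class InMemoryZip and its method names are hypothetical. writeZip accepts any OutputStream, so the same code could target a response stream instead of memory.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper; the map of "entry path -> content" is an assumption about the caller's data.
final class InMemoryZip {

    /** Writes every map entry as one ZIP item; slashes in the key become sub-directories on unzip. */
    static void writeZip(Map<String, byte[]> files, OutputStream target) throws IOException {
        ZipOutputStream zos = new ZipOutputStream(target);
        for (Map.Entry<String, byte[]> file : files.entrySet()) {
            zos.putNextEntry(new ZipEntry(file.getKey())); // e.g. "dir1/subDir1/subDir1File1.pdf"
            zos.write(file.getValue());
            zos.closeEntry();
        }
        // finish() writes the central directory but leaves the target stream open for the caller
        zos.finish();
        zos.flush();
    }

    /** Convenience variant that keeps the whole archive in memory. */
    static byte[] toBytes(Map<String, byte[]> files) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        writeZip(files, bos);
        return bos.toByteArray();
    }
}
```

Using a LinkedHashMap for the input keeps the entries in insertion order. The in-memory variant carries the memory risk mentioned in the other answers, while passing the response's output stream to writeZip avoids holding the full archive at once.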

Processing/Java File Count Issue With File Pathway (Variable Type)

Although the title isn't very clear, I have a simple issue. I'm trying to write some code in a Processing sketch (https://processing.org/) that counts how many files are in a folder. The problem is that it doesn't accept the variable type.
File folder = File("My File Path");
folder.listFiles().size;
It says the function File(String) doesn't exist. When I try to put the file path without quotation marks, it still doesn't work!
If you have a solution then please use a functioning example so that I know how it works. Thanks for any help!

As Joakim Danielson says, File is a constructor, so you need to use the new keyword.
The code below will work for you.
File folder = new File("My File Path");
int fileLength = folder.listFiles().length;

It's a constructor so you need to use new
File folder = new File("My File Path");
//To get the number of files in the folder
folder.listFiles().length;
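One detail worth guarding against: File.listFiles() returns null when the path is not a directory or cannot be read, so calling .length on the result directly can throw a NullPointerException. A small sketch in plain Java, where the class FolderCount and its method names are made up for illustration:

```java
import java.io.File;

// Hypothetical helper; -1 is an arbitrary "could not list" marker chosen for this sketch.
final class FolderCount {

    /** Counts everything directly inside the folder (files and sub-directories). */
    static int countEntries(File folder) {
        File[] entries = folder.listFiles(); // null if folder is not a directory or on I/O error
        return entries == null ? -1 : entries.length;
    }

    /** Counts only regular files directly inside the folder, ignoring sub-directories. */
    static int countFiles(File folder) {
        File[] files = folder.listFiles(File::isFile);
        return files == null ? -1 : files.length;
    }
}
```

For example, FolderCount.countFiles(new File("My File Path")) gives the file count, or -1 if the path does not point to a readable directory.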

Assuming the "My File Path" folder is inside your sketch you need to provide the path to your sketch. Luckily Processing already provides a helper function: sketchPath()
Here's an example:
File folder = new File(sketchPath("My File Path"));
println("folder.exists: " + folder.exists());
if(folder.exists()){
  println(folder.listFiles().length + " files and/or directories");
}else{
  println("folder does not exist, double check the path");
}
Bear in mind there's also a dataPath() function which points to a folder named data in your sketch folder. The data folder is typically used for storing external data, e.g. assets (raster or vector images, Processing font files) or raw data (binary, text, CSV, XML, JSON, etc.). This is useful to separate your sketch source files from the data to be loaded/accessed by your sketch.
Also, Processing has a few utility functions for listing files and folders.
Be sure to check out Processing > Examples > Topics > File IO > DirectoryList
The example includes less documented functions such as listFiles() (which returns an array of java.io.File objects based on the filters set) or listPaths() (which returns an array of String objects: just the paths).
The options and filters are quite handy. For example, if you want to list directories only and ignore files, you can simply write:
println("directories: " + listFiles(sketchPath("My File Path"),"directories").length);
Or, if you want to list all the wav files in a data/audio directory inside the sketch, you can use:
File[] files = listFiles(dataPath("audio"), "files", "extension=wav");
This will ignore directories and any other file that does not have .wav extension.
To make this answer complete, here are a few more details on the options for listFiles/listPaths from Processing's source code:
"relative" -> no effect with the Files version, but important for listPaths
"recursive"-> traverse nested directories
"extension=js" or "extensions=js|csv|txt" (no dot)
"directories" -> only directories
"files" -> only files
"hidden" -> include hidden files (prefixed with .) disabled by default

Programmatically Extract Single Specific File From 7zip Archive - Java - Linux

I would really appreciate your input on the below scenario please.
The requirements:
- I have a 7zip archive file with several thousands of files in it
- I have a Java application running on Linux that is required to retrieve individual files from the 7zip file
I would like to retrieve a file from the archive by its path (e.g. my7zFile.7z/file1.pdf) without having to iterate through all the files in the archive and comparing file names.
I would like to avoid having to extract all files from the archive before running the search (the uncompressed archive is several TB).
I had a look into 7zip Java Binding - specifically the IInArchive class, the only extract method seems to work via file index, not via file name:
http://sevenzipjbind.sourceforge.net/javadoc/net/sf/sevenzipjbinding/IInArchive.html
Do you know of any other libraries that could help me with this use case or am I overlooking a way of doing this with 7zip jbinding?
Thank you
Kind regards,
Tobi

Sadly it appears the API doesn't provide enough to fulfill all your requirements. In order to extract a single file it appears you need to walk the archive index. The simplified interface to the archive makes this much easier:
The ISimpleInArchive interface provides:
ISimpleInArchiveItem[] getArchiveItems()
allowing you to retrieve a list of items in the archive.
The ISimpleInArchiveItem interface provides the method:
java.lang.String getPath()
Hence you can walk the archiveItems comparing on path. Granted this is against your requirements.
However, note that this walks the index table and does not extract the files until requested. Once you have the item you're after, you can use:
ExtractOperationResult extractSlow(ISequentialOutStream SequentialOutStream)
on the item you have found to actually extract it.
Looking at the 7z file format (note this is not the official site of 7zip), the header information is all at the end of the file with the Signature header at the start of the file giving an offset to the start of the header info. So provided the SevenZip bindings are written nicely, your search will at most read the start of the file (SignatureHeader) to find the offset to the HeaderInfo section, then walk the HeaderInfo section in order to build up the file list required in getArchiveItems(). Only once you have the item you need will it shift back to the index of the actual stream for the file you want extracted (most likely when you call extractSlow).
So whilst not all your requirements are met, the overhead of the search/compare required is limited to only searching the header info of the archive.

I once wrote code to read all the files and folders from a zip file. I had a deep hierarchy of text files and folders inside the zip file. I am not sure whether it will help you, but here is the skeleton of the code.
import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
ZipFile zipFile = new ZipFile(filepath); // filepath of the zip file
Enumeration<? extends ZipEntry> entries = zipFile.entries();
while (entries.hasMoreElements()) {
    ZipEntry entry = entries.nextElement();
    if (entry.isDirectory()) { // found a directory inside the zip file
        // write your code here
    } else {
        InputStream stream = zipFile.getInputStream(entry);
        BufferedReader reader = new BufferedReader(new InputStreamReader(stream));
        // write your code to read the content of the file
    }
}
You can modify the code to reach your desired file in the zip. Note that this applies to ZIP archives only, not to 7z. For ZIP files you do not have to walk all entries yourself: ZipFile.getEntry(String name) looks up a single entry by its path. The entries() enumeration above visits every file and folder entry, in the order they appear in the archive's central directory. You will find detailed examples on the web.
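If the archive can be stored as ZIP rather than 7z, java.util.zip.ZipFile offers a lookup by entry path through getEntry, so no manual iteration is needed. The sketch below assumes a ZIP archive; it does not work on .7z files, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper for ZIP archives only; 7z archives need a different library.
final class ZipLookup {

    /** Returns the bytes of the named entry, or null if the archive has no such file entry. */
    static byte[] readEntry(String zipPath, String entryName) throws IOException {
        try (ZipFile zipFile = new ZipFile(zipPath)) {
            // getEntry consults the archive's index; other entries are not decompressed
            ZipEntry entry = zipFile.getEntry(entryName);
            if (entry == null || entry.isDirectory()) {
                return null;
            }
            try (InputStream in = zipFile.getInputStream(entry)) {
                return in.readAllBytes();
            }
        }
    }
}
```

A call such as ZipLookup.readEntry("archive.zip", "docs/file1.pdf") decompresses only that one entry. For very large entries it would be better to stream from getInputStream than to call readAllBytes.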

How to download a zip file into the current directory in Geb?

I tried to use this function:
downloadBytes(exportLink.@href)
but I get an array of bytes. How can I get the zip file?

A file is nothing but an array of bytes. What do you need to actually do? You can save it somewhere by using a FileOutputStream, for example.
You can use a ZipInputStream (with a ByteArrayInputStream) to read the entries directly in Java... So, what do you want to actually do?
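Both suggestions can be sketched in a few lines of Java, assuming the byte array returned by the download really is a ZIP archive. The class DownloadedZip and its method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper; zipBytes stands for whatever the download call returned.
final class DownloadedZip {

    /** Writes the downloaded bytes unchanged to the target path; the result is the zip file itself. */
    static Path saveTo(byte[] zipBytes, Path target) throws IOException {
        return Files.write(target, zipBytes);
    }

    /** Lists the entry names straight from memory, without touching the disk. */
    static List<String> entryNames(byte[] zipBytes) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }
}
```

To land the file in the current working directory, a relative path can be passed, for example DownloadedZip.saveTo(bytes, java.nio.file.Paths.get("export.zip")), where export.zip is just a placeholder name.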

I found another solution for saving the zip in Geb without being asked for a directory.
I configured my GebConfig.groovy:
profile.setPreference("browser.download.folderList",2)
profile.setPreference("browser.download.manager.showWhenStarting",false)
profile.setPreference("browser.download.dir", new File("").getAbsolutePath())
profile.setPreference("browser.helperApps.neverAsk.saveToDisk","application/zip")

Can I store a file in an ArrayList in Java using getResource?

New to Java. I am building a Java HTTP server (no special libraries allowed). There are certain files I need to serve (templates is what I call them) and I was serving them up using this piece of code:
this.getClass().getResourceAsStream("/http/templates/404.html")
And including them in my .jar. This was working. (I realize I was reading them as an input stream.)
Now I want to store all of my files (as File type) for templates, regular files, and redirects in a HashMap that looks like this: url -> file. Then I have a Response class that serves up the files.
This works for everything except my templates. If I try to insert the getResource code in the hashmap, I get an error in my Response class.
This is my code that I am using to build my hashmap:
new File(this.getClass().getResource("/http/templates/404.html").getFile())
This is the error I'm getting:
Exception in thread "main" java.io.FileNotFoundException: file:/Users/Kelly/Desktop/Java_HTTP_Server/build/jar/server.jar!/http/templates/404.html (No such file or directory)
I ran this command and can see the templates in my jar:
jar tf server.jar
Where is my thinking going wrong? I think I'm missing a piece to the puzzle.
UPDATE: Here's a slice of what I get when I run the last command above...so I think I have the path to the file correctly?
http/server/serverSocket/SystemServerSocket.class
http/server/serverSocket/WebServerSocket.class
http/server/ServerTest.class
http/templates/
http/templates/404.html
http/templates/file_directory.html
http/templates/form.html

The FileNotFoundException error you are getting is not from this line:
new File(this.getClass().getResource("/http/templates/404.html").getFile())
It appears that after storing these File objects in the hash map, you try to read the file (or serve it by reading with FileInputStream or related APIs). It would have been more useful if you had given the stack trace and the code which is actually throwing this exception.
But the point is that files present within the JAR files are not the same as files on disk. In particular, a File object represents an abstract path name on disk and all standard libraries using File object assume that it is accessible. So /a/path/like/this is a valid abstract path name, but file:/Users/Kelly/Desktop/Java_HTTP_Server/build/jar/server.jar!/http/templates/404.html is not. This is exactly what you get when you call getResource("/http/templates/404.html").getFile(). It just returns a string representing something that doesn't exist as a file on disk.
There are two ways you can serve resources from class path directly:
- Directly return the stream as a response to the request. this.getClass().getResourceAsStream() will return you the InputStream object which you can then return to the caller. This will require you to store an InputStream object in your hash map instead of a file. You can have two hash maps one for files from class path and one for files on disk.
- Extract all the templates (possibly on first access) to a temporary location say /tmp and then store the File object representing the newly extracted file.
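The second option can be sketched as below. This is a minimal illustration, not the only way to do it: the class ResourceExtractor, the method name and the "template-" prefix are assumptions. The method takes the InputStream obtained from getResourceAsStream and copies it to a temporary file that ordinary File-based code can then read.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper; prefix and suffix of the temp file are illustrative choices.
final class ResourceExtractor {

    /** Copies a class-path resource stream to a temp file and returns it as a regular File. */
    static File extractToTempFile(InputStream resource, String suffix) throws IOException {
        if (resource == null) {
            // getResourceAsStream returns null when the resource path does not exist
            throw new IOException("resource not found on the class path");
        }
        Path temp = Files.createTempFile("template-", suffix);
        temp.toFile().deleteOnExit(); // clean up when the JVM exits normally
        try (InputStream in = resource) {
            // createTempFile already created the file, so the copy must be allowed to replace it
            Files.copy(in, temp, StandardCopyOption.REPLACE_EXISTING);
        }
        return temp.toFile();
    }
}
```

With the paths from the question, the map entry could then be built from ResourceExtractor.extractToTempFile(getClass().getResourceAsStream("/http/templates/404.html"), ".html"). An InputStream can only be consumed once, so extracting to a file (or caching the bytes) avoids having to reopen the resource for every request.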
