I have a "Unix Executable File" with no file extension.
On a Mac, I can see the content in Preview, but I'm not sure of any other way to view it.
I'm looking for a way to read the content and store it in some other location as a JPG or PNG file.
I'm not sure how to read this through Java.
In the Mac terminal, I tried "file filename" and got the following output:
PNG image data, 110 x 103, 8-bit/color RGB, non-interlaced
Whatever is reporting 'unix executable file' is oversimplifying things. It's simply 'a file' (unix has sod all to do with it). The file system has the concept of an 'executable' flag, which you can set or clear on any file and which is utterly unrelated to whether the file's contents are executable. On top of that, macs and linux can mount DOS file systems (which most USB sticks use, because every OS can deal with them), and those do not have this flag at all; 'for convenience' the OS then acts as if ALL files have the flag set, and you can't remove it. In other words, it's a lie; forget about that part.
file is just guessing. This is no blame on file and the authors of that tool are by no means lazy. It's mathematically impossible - the disk system doesn't know what kind of data a file contains, it just knows: This file has these bytes, and it ends there. file just looks at the contents and takes a wild stab in the dark. Its wild stabs are decent, but no guarantee. I can make you a file that is BOTH a legal zip file (will unzip and everything just fine), AND is a PNG image equally well (renders in browsers, preview, etc). What could file possibly tell you here? Literally completely random garbage is ALSO a valid ISO-8859-1 formatted text file. The only way to know that this is clearly not the intended purpose of the file is to use Artificial Intelligence algorithms to realize that the contents in no way form legible words in any language on the planet. That's a very hard problem and file doesn't try to solve it.
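To make the "looks at the contents" part concrete, here is a minimal sketch of that kind of guess in Java. It only compares the first eight bytes against the PNG signature; the class and method names are made up for illustration, and the real file tool checks far more patterns than this.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class PngSignatureSketch {
    // The eight bytes every PNG stream starts with.
    private static final byte[] PNG_SIGNATURE =
            {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};

    // Returns true if the stream begins with the PNG signature.
    // This is a hint, not proof: the rest of the data may still be garbage.
    public static boolean startsWithPngSignature(InputStream in) throws IOException {
        byte[] head = new byte[PNG_SIGNATURE.length];
        int read = in.readNBytes(head, 0, head.length); // Java 9+
        return read == head.length && Arrays.equals(head, PNG_SIGNATURE);
    }
}
```

A match tells you the file claims to be a PNG, which is about as much as any content sniffer can say.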
Thus, there's no real way to know if it is a PNG file, if all you have is a file on disk. The file extension is a good hint, but if it's missing, you're just guessing. You can toss it through a PNG reader, and if it doesn't crash, it probably is, but it could just be a picture with random static because it isn't really a PNG file.
If you want to convert PNG files, ImageIO can do that.
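A minimal sketch of such a conversion, assuming the file really is something ImageIO can decode. The class name, method name, and the choice to repaint onto a plain RGB canvas (so formats without transparency, like JPG, can be written) are mine, not part of any required recipe.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

public class ConvertImageSketch {
    // Reads whatever image ImageIO recognises in 'source' (the extension is
    // irrelevant) and writes it to 'target' in the given format, e.g. "jpg" or "png".
    public static void convert(File source, File target, String format) throws IOException {
        BufferedImage image = ImageIO.read(source);
        if (image == null) {
            // ImageIO.read returns null when no registered reader accepts the data.
            throw new IOException("Not an image ImageIO can read: " + source);
        }
        // Repaint onto an opaque RGB canvas so writers without alpha support accept it.
        BufferedImage rgb = new BufferedImage(
                image.getWidth(), image.getHeight(), BufferedImage.TYPE_INT_RGB);
        Graphics2D g = rgb.createGraphics();
        g.drawImage(image, 0, 0, Color.WHITE, null);
        g.dispose();
        if (!ImageIO.write(rgb, format, target)) {
            throw new IOException("No ImageIO writer for format: " + format);
        }
    }
}
```

For the file in the question this would be called as something like convert(new File("filename"), new File("filename.jpg"), "jpg").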
Generally, the process that got you that file usually DOES know the format. For example, if you download it over the web, the web server didn't JUST send those bytes over. It also sent this header: Content-Type: image/png. THAT (and not the file extension) is what is the webserver's canonical truth. If the process that saves this file to disk elected to take that information and toss it in the garbage, well, now you're stuck guessing. If possible, go back to that part of the process and fix it so this info is no longer tossed in the bin. For example, if you have a shell script that uses wget to download a resource and then later on you have no idea if it's a PNG, or a JPG, or the output of a 'file not found' explanatory page in HTML, then fix wget to save that header and react accordingly.
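If the download happens in Java rather than in a shell script, the same idea might look like the sketch below: take the Content-Type value (for example from URLConnection.getContentType()) and pick the extension from it at save time. The helper name and the small mapping table are assumptions for illustration; extend the table to whatever types you expect.

```java
import java.util.Locale;

public class ContentTypeSketch {
    // Maps a Content-Type header value to a file extension.
    // Returns "" for unknown types so the caller can decide what to do.
    public static String extensionFor(String contentType) {
        if (contentType == null) {
            return "";
        }
        // Drop parameters such as "; charset=UTF-8" and normalise case.
        String mime = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        switch (mime) {
            case "image/png":  return ".png";
            case "image/jpeg": return ".jpg";
            case "text/html":  return ".html";
            default:           return "";
        }
    }
}
```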
I have an InputStream of data that is the content of a file, but does not have any file information attached. I would like to be able to distinguish between cases when the data represents a *.zip file, and cases where it is a container file format (e.g. *.docx, *.odt, *.jar) that uses zip under the covers. I don't necessarily need to know what the container format is, just whether a stream is a "plain" zip or not (so I know whether it's appropriate to split the stream into separate files or not).
Is this possible? I'm happy to do the detection either after decompressing or before.
Ideally I'm trying to do this in Java, but if there are code examples in other languages then I'm happy to port them across if necessary.
There's no absolutely reliable and correct way to do this, because those formats that use the ZIP format as a container tend to be 100% valid and correct ZIP files.
So they are ZIP files.
However, since there's not an infinite number of those formats (and only a smaller subset of those are commonly found in the real world), you can probably get away with just specifically detecting those formats and treating everything that you don't recognize as a "real" ZIP file.
Most of these formats require some kind of easy-to-check identifier in the early bytes of the file, so if you are okay with writing specification-specific code it should be easy enough.
file detects most of those formats correctly, so looking into its source should give you enough pointers.
Some examples:
OpenDocument files (this file contains all kinds of archives, not just ODx files).
Office Open XML files
It's also quite likely (haven't checked) that Apache Tika already does all that detection.
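As a rough sketch of the "detect the known formats, treat the rest as plain zip" approach in Java: the code below walks the entries of the stream and looks for marker entries. The marker names are the ones I understand these formats to use (a mimetype entry for OpenDocument and EPUB, [Content_Types].xml for Office Open XML, META-INF/MANIFEST.MF for JARs); check them against the relevant specifications before relying on this. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class ZipContainerSketch {
    // Returns true if the zip stream contains an entry that marks it as a
    // known container format; false means "treat it as a plain zip".
    // Note: this consumes the stream, so buffer it first if you need it again.
    public static boolean looksLikeContainer(InputStream in) throws IOException {
        ZipInputStream zip = new ZipInputStream(in);
        ZipEntry entry;
        while ((entry = zip.getNextEntry()) != null) {
            String name = entry.getName();
            if (name.equals("mimetype")                     // OpenDocument, EPUB (assumed marker)
                    || name.equals("[Content_Types].xml")   // Office Open XML (assumed marker)
                    || name.equals("META-INF/MANIFEST.MF")) { // JAR (assumed marker)
                return true;
            }
        }
        return false;
    }
}
```

A plain zip that happens to contain a file called mimetype would be misclassified, which is the unavoidable ambiguity described above.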
Would it be possible to unzip a tar.gz file partially, e.g. unzip only a few megabytes from the middle of a large tar.gz file?
I got this idea because we have a lot of zipped log files, and it's very time-consuming to unzip a 100 MB log file into a ~1 GB file and then search in it. It would be great to have a 'partial unzip' option.
Unless the .gz file was specially prepared for this purpose, then no, you need to decompress all of the data up to the middle in order to decompress what's in the middle.
It is possible to use Z_FULL_FLUSH in deflate() periodically to put breaks in the compressed data, allowing decompression to start at those break points. You would need a separate file and your own software to keep track of where those break points are, and how far into the uncompressed data they are.
Since it is a .tar.gz file, it would make sense to only have those breakpoints at file boundaries. The tar format itself can be read starting at any file header with no problem.
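The break-point idea can be sketched in Java with java.util.zip.Deflater, whose FULL_FLUSH mode corresponds to Z_FULL_FLUSH. This is a simplified illustration under assumptions of my own: it writes a raw deflate stream (no gzip header or trailer, so the output is not a .gz file), keeps everything in memory, and the class and method names are hypothetical. A real tool would also store the uncompressed position for each offset.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.List;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class SeekableDeflateSketch {
    // Compresses the chunks into one raw deflate stream, doing a full flush
    // after each chunk. The compressed offset where each chunk starts is
    // appended to 'offsets' (this is the side index mentioned above).
    public static byte[] compress(List<byte[]> chunks, List<Integer> offsets) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        for (byte[] chunk : chunks) {
            offsets.add(out.size());
            deflater.setInput(chunk);
            int n;
            do {
                n = deflater.deflate(buf, 0, buf.length, Deflater.FULL_FLUSH);
                out.write(buf, 0, n);
            } while (n == buf.length); // a full buffer means the flush may be unfinished
        }
        deflater.finish();
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Decompresses from a recorded break point to the end of the stream.
    public static byte[] decompressFrom(byte[] compressed, int offset)
            throws DataFormatException {
        Inflater inflater = new Inflater(true);
        // One padding byte after the data, as the Inflater docs suggest for raw streams.
        inflater.setInput(Arrays.copyOfRange(compressed, offset, compressed.length + 1));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        while (!inflater.finished()) {
            int n = inflater.inflate(buf);
            if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                break; // ran out of data without reaching the end marker
            }
            out.write(buf, 0, n);
        }
        inflater.end();
        return out.toByteArray();
    }
}
```

Starting at any offset that was not recorded after a full flush would fail or produce garbage, which is why an ordinary .gz file cannot be read from the middle.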
I'm working on a Java program that will allow me to view images in a zip/rar file without unzipping it to a folder on my hdd. I'd like to be able to flip through them like on a normal image viewer, possibly able to zoom in/out if needed.
From what I've looked into, it'll have to be extracted even if it's just to a temporary folder, which I'm fine with as long as the program can delete it on its own after. I believe ZipFile would be something I should do more research in? Most of what I've seen deal with text documents rather than image files, so I'm not sure how to proceed in my research.
I'm looking to see if I'm on the right track or if there are any good resources/specific apis I could look into to help as I haven't done any coding in months (save for light php and html) or any java in about a year, and I've had this on the back-burner for long enough.
Thanks in advance. :)
You're on the right track with ZipFile, and I don't think you need to extract to disk before viewing the images.
A ZipFile object will give you a list of its contents with entries(). You can iterate this collection of ZipEntry objects to present a choice of which file to view, and of course filter it to known extensions if you desire.
Strangely enough, it's the ZipFile object and not the individual ZipEntry objects that will give you an InputStream for a given entry. You can read this stream into a byte[] in memory and send it to the component that will be responsible for displaying the image.
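One caveat: ZipFile uses the archive's central directory to jump straight to an entry, so it does not need to decompress the earlier entries to reach the last one. That cost only shows up if you read the archive sequentially with ZipInputStream, or with formats that compress everything as one stream (tar.gz, solid rar). Decompressing and decoding a large image still takes time, though, so it may make sense to cache files on disk or in an in-memory LRU cache depending on the usage pattern.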
You didn't mention if this is for a Swing application, but if it is this might be helpful for displaying the images:
http://docs.oracle.com/javase/tutorial/uiswing/components/icon.html
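The ZipFile approach described above could be sketched roughly like this. The class name, method names, and the extension filter are assumptions for illustration; ZipFile.entries(), ZipFile.getInputStream() and ImageIO.read() are the standard JDK calls.

```java
import java.awt.image.BufferedImage;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.imageio.ImageIO;

public class ZipImageSketch {
    // Lists the entries whose names look like images, in archive order.
    public static List<String> imageNames(ZipFile zip) {
        List<String> names = new ArrayList<>();
        Enumeration<? extends ZipEntry> entries = zip.entries();
        while (entries.hasMoreElements()) {
            ZipEntry entry = entries.nextElement();
            String lower = entry.getName().toLowerCase(Locale.ROOT);
            if (!entry.isDirectory()
                    && (lower.endsWith(".png") || lower.endsWith(".jpg")
                        || lower.endsWith(".jpeg") || lower.endsWith(".gif"))) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    // Decodes one entry straight from the archive, without extracting to disk.
    // Returns null if ImageIO has no reader for the data.
    public static BufferedImage readImage(ZipFile zip, String name) throws IOException {
        ZipEntry entry = zip.getEntry(name);
        if (entry == null) {
            throw new FileNotFoundException("No such entry: " + name);
        }
        try (InputStream in = zip.getInputStream(entry)) {
            return ImageIO.read(in);
        }
    }
}
```

In a Swing viewer the returned BufferedImage can be wrapped in an ImageIcon or painted directly; flipping through images then amounts to stepping through the list from imageNames.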
I am quite new to this idea, but I tried opening a JPEG file in Notepad and, without making any change, re-saved it under a new name,
say new.jpg.
But when I open new.jpg, I get an error; no viewer is able to show the image.
What I actually want is to read an image as a stream of raw binary data that can be stored in a String, and on the other side turn it back into a stream and save it as a JPEG. I want to do this in Java, but before programming I tried the experiment described above, and it produces an error.
Opening a JPEG file with Notepad and saving it will cause errors, because it messes up the encoding of some essential JPEG markers.
Try opening your file with a hex editor (I use HexEdit and it works fine).
You should also take a look at the JPEG structure.
when you save a binary file with notepad it changes the encoding of some of the characters, that's why it is not recognised as a valid JPEG anymore.
i doubt there's a fast way to "go back" to the original file, it involves finding out which bytes were changed.
as for saving it to a string, what do you mean?
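If the goal is to carry the image bytes in a Java String and get the same bytes back, a text-safe encoding such as Base64 is one option. The sketch below contrasts that with naively decoding the bytes as text, which is roughly the kind of damage described above; UTF-8 is used only as an example of a charset that cannot represent arbitrary bytes (Notepad's actual behaviour depends on the encoding it picks). Class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class BinaryAsTextSketch {
    // Lossy: bytes that are not valid UTF-8 are replaced while decoding,
    // so the bytes that come back out differ from the input.
    public static byte[] viaPlainText(byte[] data) {
        String text = new String(data, StandardCharsets.UTF_8);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Lossless: Base64 maps any byte sequence to plain ASCII characters.
    public static String toBase64(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Reverses toBase64 and returns the original bytes.
    public static byte[] fromBase64(String text) {
        return Base64.getDecoder().decode(text);
    }
}
```

With real files, the byte[] would come from Files.readAllBytes on the JPEG and be written back with Files.write on the receiving side.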
I have a Java application that takes a zip file as input.
The application decompresses the zip file, and some of the files inside it have filenames longer than 256 characters.
Can I modify the name of a file inside the zip without decompressing it?
OS : linux/mac
It's more of a limitation of the file system used, than the OS itself.
You can only change this by formatting the drives to a different file system that supports longer file names.
Why you would need a file name that is a couple of miles long is beyond me. The only advice I can give is: try to shorten the file name.
::EDIT::
Since you updated your question. Here's the correct answer. :)
Typically, your approach is correct. If your zip contains longer filenames, though, you can truncate them: take the first 250 characters and ignore the rest. You may now have duplicate filenames, so add a number at the end, coz you've got 5 chars left.
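A minimal sketch of that truncate-and-number scheme, to be applied to each entry name while extracting. The 250-character cut comes from the suggestion above; the helper name and the "-N" suffix format are my own choices, and the sketch ignores file extensions and the fact that many file systems count bytes rather than characters.

```java
import java.util.Set;

public class ShortNameSketch {
    private static final int MAX_BASE_LENGTH = 250; // cut-off suggested above

    // Returns a name of at most 250 characters plus, if needed, a "-N" suffix
    // that makes it unique among the names already handed out in 'used'.
    public static String shorten(String name, Set<String> used) {
        String base = name.length() > MAX_BASE_LENGTH
                ? name.substring(0, MAX_BASE_LENGTH)
                : name;
        String candidate = base;
        int counter = 1;
        // Set.add returns false when the name is already taken.
        while (!used.add(candidate)) {
            candidate = base + "-" + counter++;
        }
        return candidate;
    }
}
```

Keep one Set per target directory for the duration of the extraction, so every written file gets a distinct name.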
Another option is to ask the user to enter a new file name.
It is possible to edit the zip file itself, as long as you know how it is structured, etc.
I'm not aware that Java's built-in APIs allow editing zip files. Although a while back, I came across a library named DotNetZip for Microsoft .NET which offers all the typical functionality plus editing the entries inside a zip file, encryption, passwords, etc. (it is awesome btw)
Look for a similar library for Java.
If the user were supplying a filename for a file to be created, I think you'd just have to ask for a shorter name, short enough to avoid throwing this exception.
But it sounds like this is the name of an existing file to be read from; in that case, if the system (file or operating; either way, something you can't control) is saying it's not a valid filename, how could you expect to be able to read from it?