java I/O try with resources for binary data parameters? - java

I'm rather confused about the parameters used for reading and writing binary data.
When reading and writing bytes containing ASCII characters, I understand the format is something like try(FileInputStream fin = new FileInputStream(args[0])), where the file name mostly comes from console arguments.
But I see that the try-with-resources for reading and writing binary data is
try(DataInputStream dataIn = new DataInputStream(new FileInputStream("testdata"))) &
try(DataOutputStream dataOut = new DataOutputStream(new FileOutputStream("testdata")))
Why is new FileInputStream("testdata") written like this? Why is it created as an object inside, and what is "testdata" supposed to mean?

I/O libraries can be very confusing because different communications have different needs. The architecture is intended to keep as much in common as possible. The various subclasses and wrapper classes either add characteristics (such as buffering) or processing (such as text encoding/decoding or datatype transformations).
(Sometimes it seems like the architecture is too fractured because each class adds so little and it takes two or three classes to do very common things. Commonly used 3rd-party libraries can make it easier.)
Think of it this way: if the object you have is awkward in some way, find the class in the library that makes it easier to use in the way you need. You haven't shown any code that does anything, so I can't point to a specific advantage of using the classes that you mention.
"testdata" would be the name of what your operating system presents to your user's Java as a "file". In the common case, it is the name of a file in what would be the current working directory for your program. Which directory that is might depend on how your users start your program. (Perhaps you are expecting a file name to have an extension.)
Tip: Use an IDE that presents JavaDoc
Tip: Use an IDE that allows refactoring, particularly, extract local variable. That gives you a chance to pull out expressions that are confusing to you and give them a name.
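To make the nesting concrete, here is a minimal sketch that pulls the inner file stream out into its own named variable, as the refactoring tip suggests. The class name BinaryRoundTrip and the values written are made up for illustration; "testdata" is simply resolved against whatever the current working directory happens to be.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class BinaryRoundTrip {

    // Writes an int and a double to the named file in binary form.
    static void write(String fileName, int count, double ratio) throws IOException {
        // The file stream moves raw bytes; the data stream adds typed write methods.
        // Both are closed automatically, in reverse order of declaration.
        try (FileOutputStream fileOut = new FileOutputStream(fileName);
             DataOutputStream dataOut = new DataOutputStream(fileOut)) {
            dataOut.writeInt(count);
            dataOut.writeDouble(ratio);
        }
    }

    // Reads the values back in the same order they were written.
    static String read(String fileName) throws IOException {
        try (FileInputStream fileIn = new FileInputStream(fileName);
             DataInputStream dataIn = new DataInputStream(fileIn)) {
            int count = dataIn.readInt();
            double ratio = dataIn.readDouble();
            return count + " " + ratio;
        }
    }

    public static void main(String[] args) throws IOException {
        write("testdata", 42, 0.5);
        System.out.println(read("testdata"));
    }
}
```

Writing new DataOutputStream(new FileOutputStream("testdata")) on one line does the same thing; the inner object just never gets a name of its own.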

Related

Not able to write to file - java

Pretty sure this should be really easy, but I can't write to a file. No I/O exception is thrown, nothing. I was having a similar problem reading earlier, and I tried a hundred different ways until
DataInputStream dis = new DataInputStream(reading.class.getResourceAsStream("hello.txt"));
BufferedReader input = new BufferedReader(new InputStreamReader(dis));
this worked! and I could use scanners and such to read from this point.
FileReader, making File file = new File("hello.txt"), whatever, none of that worked. I couldn't get any of it to even throw an error when it was an incorrect file name.
Now I have the same problem, except for writing to a file, but there's no equivalent to
reading.class.getResourceAsStream("hello.txt"); that makes an /output/ stream.
Does anyone know how to get the "ResourceAsStream" but as an output stream, /or/ does anyone know what my problem might be?
I know a lot of people on this website have reading/writing issues but none of the posts helped me.
note - yes I was closing, yes I was flushing, yes I had code that wrote to the file.
getResourceAsStream is meant to read resources (e.g. property files) that were distributed and packaged along with the code. There's no guarantee they're writable, e.g. both code and resources could be distributed as a jar, or jar-inside-a-WAR-inside-an-EAR...
See here Write to a file stream returned from getResourceAsStream() for additional discussion and a workaround suggestion, though it's not very recommended IMHO. I think the reasonable practice is to distinguish between (a) your immutable code distribution and (b) data editable at runtime ... the latter could reside on a different machine, have different policies for security/replication/backup, etc.
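As a rough sketch of option (b), runtime data can be written to an ordinary file system location instead of a classpath resource. The class name WritableData and the ".myapp" directory under the user's home are hypothetical choices made for this example only.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.Collections;

public class WritableData {

    // Appends one line to a file inside dataDir, creating the directory and file if needed.
    static Path appendLine(Path dataDir, String fileName, String line) throws IOException {
        Files.createDirectories(dataDir);
        Path target = dataDir.resolve(fileName);
        Files.write(target, Collections.singletonList(line), StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical location outside the jar: a directory under the user's home.
        Path dataDir = Paths.get(System.getProperty("user.home"), ".myapp");
        Path written = appendLine(dataDir, "hello.txt", "hello");
        // Printing the absolute path shows exactly which file was touched.
        System.out.println("Wrote to " + written.toAbsolutePath());
    }
}
```

Printing the absolute path is a cheap way to rule out the common case where the write succeeded, but in a different directory than the one being inspected.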

File type detection in Java without I/O

There is a built-in method in the Java JDK that detects file types:
Files.probeContentType(Paths.get("/temp/word.doc"));
The javadoc says that a FileTypeDetector may examine the filename, or it may examine a few bytes in the file, which means that it would have to actually try to pull the file from a URL.
This is unacceptable in our app; the content of the file is available only through an InputStream.
I tried to step through the code to see what the JDK is actually doing, but it seems that it goes to FileTypeDetectors.defaultFileTypeDetector.probeContentType(path) which goes to sun.nio.fs.AbstractFileTypeDetector, and I couldn't step into that code because there's no source attachment.
How do I use JDK file type detection and force it to use file content that I supply, rather than having it go out and perform I/O on its own?
The docs for Files.probeContentType() explain how to plug in your own FileTypeDetector implementation, but if you follow the docs you'll find that there is no reliable way to ensure that your implementation is the one that is selected (the idea is that different implementations serve as fallbacks for each other, not alternatives). There is certainly no documented way to prevent the built-in implementation from ever reading the target file.
You can surely find a map of common filename extensions to content types in various places around the web and probably on your own system; mime.types is a common name for such files. If you want to rely only on such a mapping file then you probably need to use your own custom facility, not the Java standard library's.
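A custom name-only facility of that kind could be as small as the following sketch. The class name ExtensionContentTypes and the four table entries are illustrative assumptions; a real table would be loaded from a mime.types-style file.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class ExtensionContentTypes {

    // Tiny illustrative table; not a complete or authoritative mapping.
    private static final Map<String, String> TYPES = new HashMap<>();
    static {
        TYPES.put("doc", "application/msword");
        TYPES.put("pdf", "application/pdf");
        TYPES.put("png", "image/png");
        TYPES.put("txt", "text/plain");
    }

    // Looks only at the name; never touches the file system or the network.
    // Returns null when the name has no extension or the extension is unknown.
    static String fromName(String fileName) {
        int slash = Math.max(fileName.lastIndexOf('/'), fileName.lastIndexOf('\\'));
        String baseName = fileName.substring(slash + 1);
        int dot = baseName.lastIndexOf('.');
        if (dot < 0 || dot == baseName.length() - 1) {
            return null;
        }
        String extension = baseName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return TYPES.get(extension);
    }

    public static void main(String[] args) {
        System.out.println(fromName("/temp/word.doc"));
    }
}
```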
The JDK's Files.probeContentType() simply loads a FileTypeDetector available in your JDK installation and asks it to detect the MIME type. If none exists then it does nothing.
Apache has a library called Tika which does exactly what you want. It determines the MIME type of the given content. It can also be plugged into your JDK to make your Files.probeContentType() function using Tika. Check this tutorial for quick code - http://wilddiary.com/detect-file-type-from-content/
If you are worried about consuming the contents of an InputStream, you can wrap it in a PushbackInputStream to "unread" those bytes so the next detector implementation can read them.
Binary files' magic numbers are usually 4 bytes, so a new PushbackInputStream(in, 4) should be sufficient.
PushbackInputStream pushbackStream = new PushbackInputStream(in, 4);
byte[] magicNumber = new byte[4];
//for this example we will assume it reads whole array
//for production you will need to check all 4 bytes read etc
pushbackStream.read(magicNumber);
//now figure out content type based on magic number
ContentType type = ...
//now pushback those 4 bytes so you can read the whole stream
pushbackStream.unread(magicNumber);
//now your downstream process can read the pushbackStream as a
//normal InputStream and gets those magic number bytes back
...
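The production caveat in the comments above (a single read may return fewer than 4 bytes) can be handled with a small loop. The following is a sketch under assumptions: the class name MagicPeek and the looksLikePdf check are invented for illustration, and "%PDF" is used only as one well-known 4-byte signature.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class MagicPeek {
    static final int MAGIC_LENGTH = 4;

    // Reads up to MAGIC_LENGTH bytes, pushes them back, and returns the bytes seen.
    // The stream must have been created with a pushback buffer of at least MAGIC_LENGTH.
    static byte[] peekMagic(PushbackInputStream in) throws IOException {
        byte[] buffer = new byte[MAGIC_LENGTH];
        int total = 0;
        while (total < MAGIC_LENGTH) {
            int n = in.read(buffer, total, MAGIC_LENGTH - total);
            if (n < 0) {
                break; // stream ended before MAGIC_LENGTH bytes were available
            }
            total += n;
        }
        if (total > 0) {
            in.unread(buffer, 0, total); // only push back what was actually read
        }
        return Arrays.copyOf(buffer, total);
    }

    // Example check: PDF files start with the ASCII bytes "%PDF".
    static boolean looksLikePdf(byte[] magic) {
        return magic.length == 4
                && magic[0] == '%' && magic[1] == 'P' && magic[2] == 'D' && magic[3] == 'F';
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "%PDF-1.7 ...".getBytes(StandardCharsets.US_ASCII);
        PushbackInputStream in =
                new PushbackInputStream(new ByteArrayInputStream(data), MAGIC_LENGTH);
        System.out.println("PDF? " + looksLikePdf(peekMagic(in)));
        // The stream still starts at the first byte for whoever reads it next.
        System.out.println("next byte: " + (char) in.read());
    }
}
```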

In Java / Android, why am I able to get an integer file descriptor from ParcelFileDescriptor?

I have solved a problem I had while coding on the NDK, but I'm not sure the solution is canonical, or if there is a canonical way to do this. First a description of what I did:
It appears that I cannot access an integer file descriptor value using Java's File or FileDescriptor objects:
http://developer.android.com/reference/java/io/File.html
http://developer.android.com/reference/java/io/FileDescriptor.html
But I can using Android's ParcelFileDescriptor object:
http://developer.android.com/reference/android/os/ParcelFileDescriptor.html
So I can get the integer fd to my native code in this way:
pfd = ParcelFileDescriptor.open(new File("blah"), MODE_WRITE_ONLY);
myNativeFunction(pfd.getFd());
Why does this work? Wouldn't the file descriptor integer field be private in the File object, since I can't access it even when I own the object? So how does the ParcelFileDescriptor get to access this presumably private field just by being passed the object in one of its public methods? Do I even want to know?
how does the ParcelFileDescriptor get to access this presumably private field
The integer file descriptor is not part of the java.io.File class at all. This class wraps a file name and can be used for names that don't exist or even cannot exist on the file system. When you work with a Java File object, the Linux file is not opened. So, to open the file, we use one of the many classes like FileInputStream, FileReader, etc.
FileDescriptor class, defined in the core Java API, can be used for some manipulations with open files; it does "know" the actual int file descriptor (you can check its source code), but Android SDK isolates this number, as #danske correctly explained.
android.os.ParcelFileDescriptor uses system library libnativehelper.so to find the value, you can see the relevant source code, too. You can actually look up the sources of the nativehelper, too.
java.io.File and java.io.FileDescriptor are part of the base Java APIs defined by Sun, which, by design, never expose an integer file descriptor. This is likely done for the sake of abstraction, i.e. to make Java programs portable to operating systems that don't use simple integers to identify open files.
On the other hand, android.os.ParcelFileDescriptor is an Android-specific API, and on Android, which is based on Linux, integers are always used to model file descriptors, so it's ok to expose it.
You probably don't need to know exactly how ParcelFileDescriptor performs its magic though. Just my 2 cents.
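The distinction between a File (just a name) and a FileDescriptor (a handle on something actually open) can be seen in plain desktop Java, without any Android APIs. This is only a sketch; the class name FileVersusDescriptor and the file names are made up.

```java
import java.io.File;
import java.io.FileDescriptor;
import java.io.FileOutputStream;
import java.io.IOException;

public class FileVersusDescriptor {

    // Opens the file and reports whether its descriptor is valid while the stream is open.
    static boolean descriptorValidWhileOpen(File file) throws IOException {
        try (FileOutputStream out = new FileOutputStream(file)) {
            // The descriptor belongs to the open stream, not to the File object.
            FileDescriptor fd = out.getFD();
            // There is no public accessor for the underlying integer here.
            return fd.valid();
        }
    }

    public static void main(String[] args) throws IOException {
        // Constructing a File opens nothing; the name need not exist.
        File nameOnly = new File("no-such-dir/no-such-file");
        System.out.println("exists: " + nameOnly.exists());

        File temp = File.createTempFile("fd-demo", ".tmp");
        temp.deleteOnExit();
        System.out.println("valid while open: " + descriptorValidWhileOpen(temp));
    }
}
```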

sending serialization file via sockets in java

System.out.println("Java is awesome!");
Pardon my enthusiasm; I just can't believe how powerful Java is, what with its ability to not only save objects (and load them), but also with its main purpose, to send them over a network. This is exactly what I must do, for I am conducting a beta-test. In this beta-test, I have given the testers a version of the game that saves the data as Objects in a location most people don't know about (we are the enlightened ones hahaha). This would work fine and dandy, except that it isn't meant for long-term persistence. But, I could collect their record.ser and counter.bin files (the latter tells me how many Objects are in record.ser) via some client/server interaction with sockets (which I know nothing about, until I started reading about it, but I still feel clueless). Most of the examples I have seen online (this one for example: http://uisurumadushanka89.blogspot.com/2010/08/send-file-via-sockets-in-java.html ) were sending the File as a stream of bytes, namely some ObjectOutputStream and ObjectInputStream. This is exactly what my current version of the game is using to save/load GameData.
Sorry for this long-winded intro, but do you know what I would have to do (steps-wise, so I can UNDERSTAND) to actually send the whole file? Would I have to reconstruct the file byte-by-byte (or Object-by-Object)?
It's pretty simple, actually. Just make your objects serializable, and create an ObjectOutputStream and ObjectInputStream that are connected to whatever underlying streams you have, say a FileOutputStream and FileInputStream. Then just writeObject() whatever object you want to the stream and readObject() it on the other side.
Here's an example for you.
For sockets it will be something like
ObjectOutputStream objectOut = new ObjectOutputStream(serverSocket.getOutputStream());
ObjectInputStream objectIn = new ObjectInputStream(clientSocket.getInputStream());
Java Serialization is an immensely powerful protocol. java.io.ObjectOutputStream and java.io.ObjectInputStream are the higher-level classes, which wrap lower-level streams such as FileOutputStream and FileInputStream. My question is: why do you wish to read the file byte by byte when the entire file can be read as Objects?
Here is a good explanation of the procedure.
http://www.tutorialspoint.com/java/java_serialization.html
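The object-by-object approach described in these answers might look roughly like the sketch below. GameRecord and its fields are hypothetical stand-ins for the game's data, and an in-memory byte array plays the role of the socket so the example runs on its own; with real sockets, the streams would come from socket.getOutputStream() and socket.getInputStream().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;

public class SerializationRoundTrip {

    // Hypothetical payload; any class implementing Serializable works the same way.
    static class GameRecord implements Serializable {
        private static final long serialVersionUID = 1L;
        final String player;
        final int score;

        GameRecord(String player, int score) {
            this.player = player;
            this.score = score;
        }
    }

    // Sender side: wrap the raw stream and write the object.
    static void send(GameRecord record, OutputStream rawOut) throws IOException {
        ObjectOutputStream objectOut = new ObjectOutputStream(rawOut);
        objectOut.writeObject(record);
        objectOut.flush(); // push buffered bytes into the underlying stream
    }

    // Receiver side: wrap the raw stream and read the object back.
    static GameRecord receive(InputStream rawIn) throws IOException, ClassNotFoundException {
        ObjectInputStream objectIn = new ObjectInputStream(rawIn);
        return (GameRecord) objectIn.readObject();
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream wire = new ByteArrayOutputStream(); // stands in for the socket
        send(new GameRecord("tester", 1200), wire);
        GameRecord copy = receive(new ByteArrayInputStream(wire.toByteArray()));
        System.out.println(copy.player + " " + copy.score);
    }
}
```

Both sides need the same GameRecord class on their classpath. If the server only needs to archive record.ser without interpreting it, copying the file's raw bytes to the socket's output stream would avoid deserializing at all.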

Is it possible to prepend data to a file without rewriting?

I deal with very large binary files (several GB to multiple TB per file). These files exist in a legacy format, and upgrading requires writing a header to the FRONT of the file. I can create a new file and rewrite the data, but sometimes this can take a long time. I'm wondering if there is any faster way to accomplish this upgrade. The platform is limited to Linux and I'm willing to use low-level functions (ASM, C, C++) / file system tricks to make this happen. The primary library is Java and JNI is completely acceptable.
There's no general way to do this natively.
Maybe some file-systems provide some functions to do this (cannot give any hint about this), but your code will then be file-system dependent.
A solution could be that of simulating a file-system: you could store your data on a set of several files, and then provide some functions to open, read and write data as if it was a single file.
Sounds crazy, but you can store the file data in reverse order, if it is possible to change function that reads data from file. In that case you can append data (in reverse order) at the end of the file. It is just a general idea, so I can't recommend anything particular.
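The code for reversing the current file's contents could look like this:
// needs <algorithm>, <fstream>, <iterator>, <string>
std::string records;  // holds the file contents
std::ofstream out;    // must be opened on the target file before copying
std::copy(records.rbegin(), records.rend(), std::ostreambuf_iterator<char>(out));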
It depends on what you mean by "filesystem tricks". If you're willing to get down-and-dirty with the filesystem's on-disk format, and the size of the header you want to add is a multiple of the filesystem block size, then you could write a program to directly manipulate the filesystem's on-disk structures (with the filesystem unmounted).
This enterprise is about as hairy as it sounds though - it'd likely only be worth it if you had hundreds of these giant files to process.
I would just use the standard Linux tools to do it.
Writing another application to do it seems like it would be sub-optimal.
cat headerFile oldFile > tmpFile && mv tmpFile oldFile
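For completeness, since the primary library is Java, the same rewrite-and-rename approach can be sketched with java.nio. This still copies every byte of the old file, so it is no faster than the shell command. The class name PrependByRewrite is made up, and the temporary file is deliberately created in the same directory so the final move stays on one file system.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class PrependByRewrite {

    // Writes header + old contents to a temp file, then replaces the old file with it.
    static void prepend(byte[] header, Path oldFile) throws IOException {
        Path directory = oldFile.toAbsolutePath().getParent();
        Path tmpFile = Files.createTempFile(directory, "prepend", ".tmp");
        try (OutputStream out = Files.newOutputStream(tmpFile)) {
            out.write(header);
            Files.copy(oldFile, out); // streams the old contents after the header
        }
        // Note: the temp file's permissions may differ from the original's.
        Files.move(tmpFile, oldFile, StandardCopyOption.REPLACE_EXISTING);
    }
}
```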
I know this is an old question, but I hope this helps someone in the future. Similar to simulating a filesystem, you could simply use a named pipe:
mkfifo /path/to/file_to_be_read
{ echo "HEADER"; cat /path/to/source_file; } > /path/to/file_to_be_read
Then, you run your legacy program against /path/to/file_to_be_read, and the input would be:
HEADER
contents of /path/to/source_file
...
This will work as long as the program reads the file sequentially and doesn't do mmap() or rewind() past the buffer.
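If the reading side is Java code that can be changed, a similar effect is available in-process without a pipe: present the header and the untouched file as one sequential stream. This is a sketch, with HeaderView as an invented class name, and it carries the same limitation as the pipe: it only helps readers that consume the data sequentially.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class HeaderView {

    // Returns a stream that yields the header bytes first, then the file's bytes.
    // The file on disk is never modified; closing the result closes the file stream.
    static InputStream withHeader(byte[] header, Path source) throws IOException {
        InputStream headerStream = new ByteArrayInputStream(header);
        InputStream fileStream = Files.newInputStream(source);
        return new SequenceInputStream(headerStream, fileStream);
    }
}
```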
