Java - directly reading a file without downloading and storing locally [duplicate] - java

This question already has answers here:
How do I read / convert an InputStream into a String in Java?
(62 answers)
Closed 8 years ago.
I want to directly read a file, put it into a string without storing the file locally. I used to do this with an old project, but I don't have the source code anymore. I used to be able to get the source of my website this way.
However, I don't remember if I did it by "InputStream to String array of lines to String", or if I directly read it into a String.
Was there a function for this, or am I remembering wrong?
(Note: this function would be the PHP equivalent of file_get_contents($path))

You need to use InputStreamReader to convert from a binary input stream to a Reader which is appropriate for reading text.
After that, you need to read to the end of the reader.
Personally I'd do all this with Guava, which has convenience methods for this sort of thing, e.g. CharStreams.toString(Readable).
When you create the InputStreamReader, make sure you supply the appropriate character encoding - if you don't, you'll get junk text out (just like trying to play an MP3 file as if it were a WAV, for example).

Check out apache-commons-io and for your use case FileUtils.readFileToString(File file)
(should not be to hard to get a File form the path).
You can use the library or have a look at the code - as this is open.

There is no direct way to read a File into a String.
But there is a quick alternative - read the File into a Byte array and convert it into a String.
Untested:
File f = new File("/foo/bar");
InputStream fStream = new FileInputStream(f);
ByteArrayOutputStream bStream = new ByteArrayOutputStream();
for(int data = fStream.read(); data > -1; data = fStream.read()) {
b.write(data);
}
String theResult = new String(bStream.toByteArray(), "UTF-8");

Related

Reading a binary file from the file system as a BLOB to use in rhino with javascript

I'm planing to use SheetJS with rhino. And sheetjs takes a binary object(BLOB if i'm correct) as it's input. So i need to read a file from the system using stranded java I/O methods and store it into a blob before passing it to sheetjs. eg :-
var XLDataWorkBook = XLSX.read(blobInput, {type : "binary"});
So how can i create a BLOB(or appropriate type) from a binary file in java in order to pass it in.
i guess i cant pass streams because i guess XLSX needs a completely created object to process.
I found the answer to this by myself. i was able to get it done this way.
Read the file with InputStream and then write it to a ByteArrayOutputStream. like below.
ByteArrayOutputStream buffer = new ByteArrayOutputStream();
...
buffer.write(bytes, 0, len);
Then create a byte array from it.
byte[] byteArray = buffer.toByteArray();
Finally i did convert it to a Base64 String (which is also applicable in my case) using the "Base64.encodeBase64String()" method in apache.commons.codec.binary package. So i can pass Base64 String as a method parameter.
If you further need there are lot of libraries(3rd-party and default) available for Base64 to Blob conversion as well.

equivalent to Files.readAllLines() for InputStream or Reader?

I have a file that I've been reading into a List via the following method:
List<String> doc = java.nio.file.Files.readAllLines(new File("/path/to/src/resources/citylist.csv").toPath(), StandardCharsets.UTF_8);
Is there any nice (single-line) Java 7/8/nio2 way to pull off the same feat with a file that's inside an executable Jar (and presumably, has to be read with an InputStream)? Perhaps a way to open an InputStream via the classloader, then somehow coerce/transform/wrap it into a Path object? Or some new subclass of InputStream or Reader that contains an equivalent to File.readAllLines(...)?
I know I could do it the traditional way in a half page of code, or via some external library... but before I do, I want to make sure that recent releases of Java can't already do it "out of the box".
An InputStream represents a stream of bytes. Those bytes don't necessarily form (text) content that can be read line by line.
If you know that the InputStream can be interpreted as text, you can wrap it in a InputStreamReader and use BufferedReader#lines() to consume it line by line.
try (InputStream resource = Example.class.getResourceAsStream("resource")) {
List<String> doc =
new BufferedReader(new InputStreamReader(resource,
StandardCharsets.UTF_8)).lines().collect(Collectors.toList());
}
You can use Apache Commons IOUtils#readLines:
List<String> doc = IOUtils.readLines(inputStream, StandardCharsets.UTF_8);

Using OpenCSV to convert CSV file to XML using byte array in input

I have a question:
I'm trying to convert my CSV file to XML file and I'm seeing the response of this post: Java lib or app to convert CSV to XML file?
I see that I need use this OpenCSV library and in particular, I must use this code:
CSVReader reader = new CSVReader(new FileReader(startFile));
where String startFile = "./startData.csv";
Now, I don't get a String as startFile, but I have a byte[] because, for other question, I have convert my file in byte[]. How can I use this code with byte[]?
Are there alternatives?
Thanks
Since CSVReader's constructor takes a Reader as parameter, you can pretty much pass anything that's readable to it.
So in your case, you may try using a bytes stream reader, as in:
CSVReader reader = new CSVReader(
new InputStreamReader(
new ByteArrayInputStream(yourByteArray)));

CSV file validation with Java

I'm reading a file line by line, like this:
FileReader myFile = new FileReader(File file);
BufferedReader InputFile = new BufferedReader(myFile);
// Read the first line
String currentRecord = InputFile.readLine();
while(currentRecord != null) {
currentRecord = InputFile.readLine();
}
But if other types of files are uploaded, it will still read their contents. For instance, if the uploaded file is an image, it will output junk characters when reading the file. So my question is: how can I check the file is CSV for sure before reading it?
Checking extension of the file is kind of lame since someone can upload a file that is not CSV but has a .csv extension. Thanks in advance.
Determining the MIME type of a file is not something easy to do, especially if ASCII sections can be mixed with binary ones.
Actually, when you look at how a java mail system does determine the MIME type of an email, it does involve reading all bytes in it, and applying some "rules".
Check out MimeUtility.java
If the primary type of this datasource is "text" and if all the bytes in its input stream are US-ASCII, then the encoding is "7bit".
If more than half of the bytes are non-US-ASCII, then the encoding is "base64".
If less than half of the bytes are non-US-ASCII, then the encoding is "quoted-printable".
If the primary type of this datasource is not "text", then if all the bytes of its input stream are US-ASCII, the encoding is "7bit".
If there is even one non-US-ASCII character, the encoding is "base64".
#return "7bit", "quoted-printable" or "base64"
As mentioned by mmyers in a deleted comment, JavaMimeType is supposed to do the same thing, but:
it is dead since 2006
it does involve reading the all content!
:
File file = new File("/home/bibi/monfichieratester");
InputStream inputStream = new FileInputStream(file);
ByteArrayOutputStream byteArrayStream = new ByteArrayOutputStream();
int readByte;
while ((readByte = inputStream.read()) != -1) {
byteArrayStream.write(readByte);
}
String mimetype = "";
byte[] bytes = byteArrayStream.toByteArray();
MagicMatch m = Magic.getMagicMatch(bytes);
mimetype = m.getMimeType();
So... since you are reading the all content of the file anyway, you could take advantage of that to determine the type based on that content and your own rules.
Java Mime Magic may be of use. It'll analyse mime-types from files and inputstreams. I can't vouch for it's functionality, however.
This link may provide further info. It provides several different means of determining how to do what you want (or at least something similar).
I would perhaps be tempted to write something specific to your problem domain. e.g. determining the number of comma-separated values per line and rejecting if it's not within certain limits. Then split on the commas and parse each entry according to requirements (e.g. are they doubles/floats/valid Strings - and if strings, what encoding). I think you may have to do this anyway, given that someone may upload a file that starts like a CSV but is corrupted half-way through.

Corrupt file when using Java to download file

This problem seems to happen inconsistently. We are using a java applet to download a file from our site, which we store temporarily on the client's machine.
Here is the code that we are using to save the file:
URL targetUrl = new URL(urlForFile);
InputStream content = (InputStream)targetUrl.getContent();
BufferedInputStream buffered = new BufferedInputStream(content);
File savedFile = File.createTempFile("temp",".dat");
FileOutputStream fos = new FileOutputStream(savedFile);
int letter;
while((letter = buffered.read()) != -1)
fos.write(letter);
fos.close();
Later, I try to access that file by using:
ObjectInputStream keyInStream = new ObjectInputStream(new FileInputStream(savedFile));
Most of the time it works without a problem, but every once in a while we get the error:
java.io.StreamCorruptedException: invalid stream header: 0D0A0D0A
which makes me believe that it isn't saving the file correctly.
I'm guessing that the operations you've done with getContent and BufferedInputStream have treated the file like an ascii file which has converted newlines or carriage returns into carriage return + newline (0x0d0a), which has confused ObjectInputStream (which expects serialized data objects.
If you are using an FTP URL, the transfer may be occurring in ASCII mode.
Try appending ";type=I" to the end of your URL.
Why are you using ObjectInputStream to read it?
As per the javadoc:
An ObjectInputStream deserializes primitive data and objects previously written using an ObjectOutputStream.
Probably the error comes from the fact you didn't write it with ObjectOutputStream.
Try reading it wit FileInputStream only.
Here's a sample for binary ( although not the most efficient way )
Here's another used for text files.
There are 3 big problems in your sample code:
You're not just treating the input as bytes
You're needlessly pulling the entire object into memory at once
You're doing multiple method calls for every single byte read and written -- use the array based read/write!
Here's a redo:
URL targetUrl = new URL(urlForFile);
InputStream is = targetUrl.getInputStream();
File savedFile = File.createTempFile("temp",".dat");
FileOutputStream fos = new FileOutputStream(savedFile);
int count;
byte[] buff = new byte[16 * 1024];
while((count = is.read(buff)) != -1) {
fos.write(buff, 0, count);
}
fos.close();
content.close();
You could also step back from the code and check to see if the file on your client is the same as the file on the server. If you get both files on an XP machine, you should be able to use the FC utility to do a compare (check FC's help if you need to run this as a binary compare as there is a switch for that). If you're on Unix, I don't know the file compare program, but I'm sure there's something.
If the files are identical, then you're looking at a problem with the code that reads the file.
If the files are not identical, focus on the code that writes your file.
Good luck!

Categories

Resources