I have a (possibly long) list of binary files that I want to read lazily. There will be too many files to load into memory. I'm currently reading them as a MappedByteBuffer with FileChannel.map(), but that probably isn't required. I want the method readBinaryFiles(...) to return a Java 8 Stream so I can lazy load the list of files as I access them.
public List<FileDataMetaData> readBinaryFiles(
List<File> files,
int numDataPoints,
int dataPacketSize )
throws
IOException {
List<FileDataMetaData> fmdList = new ArrayList<FileDataMetaData>();
IOException lastException = null;
for (File f: files) {
try {
FileDataMetaData fmd = readRawFile(f, numDataPoints, dataPacketSize);
fmdList.add(fmd);
} catch (IOException e) {
logger.error("", e);
lastException = e;
}
}
if (null != lastException)
throw lastException;
return fmdList;
}
// The List<DataPacket> returned will be in the same order as in the file.
public FileDataMetaData readRawFile(File file, int numDataPoints, int dataPacketSize) throws IOException {
FileDataMetaData fmd;
FileChannel fileChannel = null;
try {
fileChannel = new RandomAccessFile(file, "r").getChannel();
long fileSz = fileChannel.size();
ByteBuffer bbRead = ByteBuffer.allocate((int) fileSz);
MappedByteBuffer buffer = fileChannel.map(FileChannel.MapMode.READ_ONLY, 0, fileSz);
buffer.get(bbRead.array());
List<DataPacket> dataPacketList = new ArrayList<DataPacket>();
while (bbRead.hasRemaining()) {
int channelId = bbRead.getInt();
long timestamp = bbRead.getLong();
int[] data = new int[numDataPoints];
for (int i=0; i<numDataPoints; i++)
data[i] = bbRead.getInt();
DataPacket dp = new DataPacket(channelId, timestamp, data);
dataPacketList.add(dp);
}
fmd = new FileDataMetaData(file.getCanonicalPath(), fileSz, dataPacketList);
} catch (IOException e) {
logger.error("", e);
throw e;
} finally {
if (null != fileChannel) {
try {
fileChannel.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
return fmd;
}
Returning fmdList.stream() from readBinaryFiles(...) won't accomplish this, because by then the contents of every file will already have been read into memory, which is exactly what I can't afford.
The other approaches to reading the contents of multiple files as a Stream rely on using Files.lines(), but I need to read binary files.
I'm open to doing this in Scala or Go if those languages have better support for this use case than Java.
I'd appreciate any pointers on how to read the contents of multiple binary files lazily.
No laziness is possible within a single file, because you read the entire file to construct an instance of FileDataMetaData. You would need a substantial refactoring of that class to be able to construct an instance without reading the entire file.
However, there are several things to clean up in that code, even with Java 7 rather than Java 8: you no longer need the RandomAccessFile detour to open a channel, and try-with-resources ensures proper closing. Note further that your use of memory mapping makes no sense. When you copy the entire contents into a heap ByteBuffer after mapping the file, there is nothing lazy about it. It is exactly what happens when you call read with a heap ByteBuffer on a channel, except that the JRE can reuse buffers in the read case.
In order to allow the system to manage the pages, you have to read from the mapped byte buffer. Depending on the system, this might still not be better than repeatedly reading small chunks into a heap byte buffer.
public FileDataMetaData readRawFile(
File file, int numDataPoints, int dataPacketSize) throws IOException {
try(FileChannel fileChannel=FileChannel.open(file.toPath(), StandardOpenOption.READ)) {
long fileSz = fileChannel.size();
MappedByteBuffer bbRead=fileChannel.map(FileChannel.MapMode.READ_ONLY, 0, fileSz);
List<DataPacket> dataPacketList = new ArrayList<>();
while(bbRead.hasRemaining()) {
int channelId = bbRead.getInt();
long timestamp = bbRead.getLong();
int[] data = new int[numDataPoints];
for (int i=0; i<numDataPoints; i++)
data[i] = bbRead.getInt();
dataPacketList.add(new DataPacket(channelId, timestamp, data));
}
return new FileDataMetaData(file.getCanonicalPath(), fileSz, dataPacketList);
} catch (IOException e) {
logger.error("", e);
throw e;
}
}
Building a Stream based on this method is straightforward; only the checked exception has to be handled:
public Stream<FileDataMetaData> readBinaryFiles(
List<File> files, int numDataPoints, int dataPacketSize) throws IOException {
return files.stream().map(f -> {
try {
return readRawFile(f, numDataPoints, dataPacketSize);
} catch (IOException e) {
logger.error("", e);
throw new UncheckedIOException(e);
}
});
}
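To see that files are only read on demand with this kind of pipeline, here is a minimal self-contained sketch. LazyReadDemo, load and readAll are hypothetical names; load merely stands in for readRawFile by reading all bytes of a file, and the counter exists only to show how many files were actually touched.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

class LazyReadDemo {
    // Counts how many files were actually read (demonstration only).
    static final AtomicInteger filesRead = new AtomicInteger();

    // Hypothetical stand-in for readRawFile: reads the whole file into a byte[].
    static byte[] load(Path p) {
        try {
            filesRead.incrementAndGet();
            return Files.readAllBytes(p);
        } catch (IOException e) {
            // Checked exceptions cannot cross the lambda boundary, so wrap them.
            throw new UncheckedIOException(e);
        }
    }

    // Nothing is read here; reading happens when a terminal operation pulls elements.
    static Stream<byte[]> readAll(List<Path> files) {
        return files.stream().map(LazyReadDemo::load);
    }

    public static void main(String[] args) throws IOException {
        Path a = Files.createTempFile("demo", ".bin");
        Path b = Files.createTempFile("demo", ".bin");
        Files.write(a, new byte[] {1, 2, 3});
        Files.write(b, new byte[] {4, 5});

        Stream<byte[]> stream = readAll(Arrays.asList(a, b));
        System.out.println("files read before terminal operation: " + filesRead.get());
        Optional<byte[]> first = stream.findFirst(); // short-circuits after one element
        System.out.println("files read after findFirst: " + filesRead.get());
    }
}
```

Each element is still a fully loaded file, so memory use depends on how many results the consumer keeps reachable at once, not on the length of the file list.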
This should be sufficient:
return files.stream().map(f -> readRawFile(f, numDataPoints, dataPacketSize));
…if, that is, you are willing to remove throws IOException from the readRawFile method’s signature. You could have that method catch IOException internally and wrap it in an UncheckedIOException. (The problem with deferred execution is that the exceptions also need to be deferred.)
I don't know how performant this is, but you can use java.io.SequenceInputStream wrapped inside of DataInputStream. This will effectively concatenate your files together. If you create a BufferedInputStream from each file, then the whole thing should be properly buffered.
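A minimal sketch of that idea, under the assumption that the files are given as a list of paths. ConcatenatedFiles and open are hypothetical names. Each file is opened only when SequenceInputStream advances to it (the first one at construction), and an exhausted file is closed before the next one is opened.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.Iterator;
import java.util.List;

class ConcatenatedFiles {
    // Returns one DataInputStream that reads the given files back to back.
    static DataInputStream open(List<Path> files) {
        Iterator<Path> it = files.iterator();
        Enumeration<InputStream> streams = new Enumeration<InputStream>() {
            @Override
            public boolean hasMoreElements() {
                return it.hasNext();
            }

            @Override
            public InputStream nextElement() {
                try {
                    // Opened on demand, buffered per file as the answer suggests.
                    return new BufferedInputStream(Files.newInputStream(it.next()));
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
        };
        return new DataInputStream(new SequenceInputStream(streams));
    }
}
```

One trade-off of this approach is that file boundaries disappear: the reader sees a single stream of bytes, so per-file information such as the path and size stored in FileDataMetaData would have to be tracked separately.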
Building on VGR's comment, I think his basic solution of:
return files.stream().map(f -> readRawFile(f, numDataPoints, dataPacketSize))
is correct, in that it will lazily process the files (and stop if a short-circuiting terminal operation is invoked on the result of the map() operation). I would also suggest a slightly different implementation of readRawFile that leverages try-with-resources and InputStream, which will not load the whole file into memory:
public FileDataMetaData readRawFile(File file, int numDataPoints, int dataPacketSize)
throws DataPacketReadException { // <- Custom unchecked exception, nested for class
FileDataMetaData results = null;
try (FileInputStream fileInput = new FileInputStream(file)) {
String filePath = file.getCanonicalPath();
long fileSize = fileInput.getChannel().size();
DataInputStream dataInput = new DataInputStream(new BufferedInputStream(fileInput));
results = new FileDataMetaData(
filePath,
fileSize,
dataPacketsFrom(dataInput, numDataPoints, dataPacketSize, filePath));
} catch (IOException e) {
// Opening, sizing, or closing the file failed; wrap it in the same unchecked exception
throw new DataPacketReadException("Unexpected I/O exception on file: " + file, e);
}
return results;
}
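private List<DataPacket> dataPacketsFrom(DataInputStream dataInput, int numDataPoints, int dataPacketSize, String filePath)
throws DataPacketReadException {
List<DataPacket> packets = new ArrayList<>();
try {
// available() can throw IOException as well, so it sits inside the try block
while (dataInput.available() > 0) {
// Logic to assemble DataPacket, using the same layout as in the question
int channelId = dataInput.readInt();
long timestamp = dataInput.readLong();
int[] data = new int[numDataPoints];
for (int i = 0; i < numDataPoints; i++)
data[i] = dataInput.readInt();
packets.add(new DataPacket(channelId, timestamp, data));
}
}
catch (EOFException e) {
throw new DataPacketReadException("Unexpected EOF on file: " + filePath, e);
}
catch (IOException e) {
throw new DataPacketReadException("Unexpected I/O exception on file: " + filePath, e);
}
return packets;
}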
This should reduce the amount of code, and make sure that your files get closed on error.
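If per-file laziness is not enough and the packets themselves should be produced on demand, one option is an iterator that decodes a single packet per call instead of collecting them into a list. The following is only a sketch under the packet layout shown in the question (int channel id, long timestamp, numDataPoints ints). Packet and PacketIterator are hypothetical names, with Packet standing in for the DataPacket class that is not shown.

```java
import java.io.Closeable;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical stand-in for the question's DataPacket class.
final class Packet {
    final int channelId;
    final long timestamp;
    final int[] data;

    Packet(int channelId, long timestamp, int[] data) {
        this.channelId = channelId;
        this.timestamp = timestamp;
        this.data = data;
    }
}

// Decodes one packet per next() call; only the current packet is held in memory.
class PacketIterator implements Iterator<Packet>, Closeable {
    private final DataInputStream in;
    private final int numDataPoints;
    private Packet lookahead;
    private boolean finished;

    PacketIterator(InputStream source, int numDataPoints) {
        this.in = new DataInputStream(source);
        this.numDataPoints = numDataPoints;
    }

    @Override
    public boolean hasNext() {
        if (lookahead == null && !finished) {
            lookahead = readPacket();
        }
        return lookahead != null;
    }

    @Override
    public Packet next() {
        if (!hasNext()) throw new NoSuchElementException();
        Packet p = lookahead;
        lookahead = null;
        return p;
    }

    private Packet readPacket() {
        try {
            int channelId;
            try {
                channelId = in.readInt();
            } catch (EOFException e) {
                // No further packet starts here (fewer than 4 trailing bytes are ignored).
                finished = true;
                return null;
            }
            long timestamp = in.readLong();
            int[] data = new int[numDataPoints];
            for (int i = 0; i < numDataPoints; i++) {
                data[i] = in.readInt();
            }
            return new Packet(channelId, timestamp, data);
        } catch (IOException e) {
            // Also covers an EOFException in the middle of a packet (truncated file).
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void close() throws IOException {
        in.close();
    }
}
```

The caller is responsible for closing the iterator, for example with try-with-resources around a BufferedInputStream opened on the file.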
I am trying to allow a user to download a file (attachment) using Java to serve up the download. I have been partially successful. The file is read, and on the client side there is a prompt for a download. A file is saved successfully, but it has 0 bytes. Here is my server side code:
String stored = "/var/lib/tomcat/webapps/myapp/attachments/" + request.getParameter("stored");
String realname = request.getParameter("realname");
// Open the input and output streams
FileInputStream attachmentFis = new FileInputStream(stored);
FileOutputStream attachmentFos = new FileOutputStream(realname);
try {
// Send the file
byte[] attachmentBuffer = new byte[1024];
int count = 0;
while((count = attachmentFis.read(attachmentBuffer)) != -1) {
attachmentFos.write(attachmentBuffer, 0, count);
}
} catch (IOException e) {
// Exception handling
} finally {
// Close the streams
attachmentFos.flush();
attachmentFos.close();
attachmentFis.close();
}
For context, this is in a servlet. The files have an obfuscated name, which is passed as "stored" here. The actual file name, the name the user will see, is "realname".
What do I need to do to get the actual file to arrive at the client end?
EDIT
Following suggestions in the comments, I changed the write to include the 0, count parameters and put the close stuff in a finally block. However, I am still getting a 0 byte file when I attempt a download.
EDIT 2
Thanks to the logging suggestion from Dave the Dane, I discovered the file was being written locally. A bit of digging and I found I needed to use response.getOutputStream().write instead of a regular FileOutputStream. I have been successful in getting a file to download through this method. Thank you all for your helpful suggestions.
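The fix described in EDIT 2 can be sketched without the servlet API by writing the copy loop against a plain OutputStream. AttachmentSender and send are hypothetical names; in the servlet, the stream passed in would be the one returned by response.getOutputStream() rather than a FileOutputStream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class AttachmentSender {
    // Copies the stored file to `out` and returns the number of bytes sent.
    // The caller owns `out`, so it is flushed here but not closed.
    static long send(Path stored, OutputStream out) throws IOException {
        long total = 0;
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(stored)) {
            int count;
            while ((count = in.read(buffer)) != -1) {
                out.write(buffer, 0, count);
                total += count;
            }
        }
        out.flush();
        return total;
    }
}
```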
As others have observed, you'd be better off using try-with-resources & let that handle the closing.
Assuming you have some Logging Framework available, maybe the following would cast light on the matter...
try {
LOG.info ("Requesting....");
final String stored = "/var/lib/tomcat/webapps/myapp/attachments/" + request.getParameter("stored");
LOG.info ("stored.......: {}", stored);
final String realname = request.getParameter("realname");
LOG.info ("realname.....: {}", realname);
final File fileStored = new File(stored);
LOG.info ("fileStored...: {}", fileStored .getCanonicalPath());
final File fileRealname = new File(realname);
LOG.info ("fileRealname.: {}", fileRealname.getCanonicalPath());
try(final InputStream attachmentFis = new FileInputStream (fileStored);
final OutputStream attachmentFos = new FileOutputStream(fileRealname))
{
final byte[] attachmentBuffer = new byte[64 * 1024];
int count;
while((count = attachmentFis.read (attachmentBuffer)) != -1) {
attachmentFos.write(attachmentBuffer, 0, count);
LOG.info ("Written......: {} bytes to {}", count, realname);
}
attachmentFos.flush(); // Probably done automatically in .close()
}
LOG.info ("Done.");
}
catch (final Exception e) {
LOG.error("Problem!.....: {}", request, e);
}
If it won't reach the finally block, you should stop ignoring the IOException which is being thrown:
catch (IOException e) {
// Exception handling
System.err.println(e.getMessage());
}
I'd assume that realname is just missing an absolute path.
I made a program which accesses some URLs and downloads the PDFs from there. The files range from 2 MB to 40 MB. The program works with no problems, but is there a way to improve the performance? For the larger files it takes a long time.
The code below is the one used for reading / writing the file. This is called in a for loop with different fileNameURLPath.
@Override
public void downloadFile(String fileNameURLPath, String titleCellValue) throws FileException {
try (BufferedInputStream inputStream
= new BufferedInputStream(new URL(fileNameURLPath).openStream())){
FileOutputStream fileOS = new FileOutputStream(FileConstants.MandatoryDownloadProperties.path + titleCellValue + ".pdf");
byte data[] = new byte[32*1024];
int byteContent;
while((byteContent = inputStream.read(data,0 , data.length)) != -1) {
fileOS.write(data, 0 , byteContent);
}
inputStream.close();
fileOS.close();
} catch (MalformedURLException e) {
throw new FileException("Error while processing url. Make sure it is correct");
} catch (IOException e) {
throw new FileException("Error while downloading file. Make sure the download path is correct");
}
}
I read something about Java NIO, but I couldn't quite understand it or tell whether it can help me in this situation.
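For reference, the NIO-style equivalent of the loop above is short. This is a sketch only, with NioDownload and download as hypothetical names. It also closes the input stream on every path, which the original loop does not do for fileOS when an exception is thrown. Whether it is any faster is doubtful if the network is the bottleneck, which seems likely for files of this size.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class NioDownload {
    // Streams the content behind `source` into `target`, replacing an existing file.
    // Returns the number of bytes written.
    static long download(URL source, Path target) throws IOException {
        try (InputStream in = source.openStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

If several files are independent of each other, downloading them concurrently rather than one after another in the for loop is more likely to reduce the total time than changing the copy loop itself.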
I am writing a function that takes a text file, counts how many lines it has, and stores the lines in an array of strings. Doing this, I have several exceptions I need to look out for. The method has several variables that should be in scope throughout, but when I assign a value to one inside a try block, the return statement cannot see it. I've moved the declaration around and nothing helps.
The compiler says the returned variable h5Files "might not have been initialized". Since I don't know how long the array will be, I cannot initialize it to a certain length up front. I do size it within the code, and I need a way to tell the return statement that it now has a value.
Here is the code:
public String[] ReadScanlist(String fileIn){
int i;
String directory ="c:\\data\\"; // "\" is an illegal character
System.out.println(directory);
int linereader = 0;
String h5Files[];
File fileToRead = new File(directory + fileIn);
System.out.println(fileToRead);
try {
FileInputStream fin = new FileInputStream(fileToRead); // open this file
}
catch(FileNotFoundException exc) {
System.out.println("File Not Found");
}
try{
//read bytes until EOF is detected
do {
FileReader fr = new FileReader(fileToRead);// Need to convert to reader
LineNumberReader lineToRead = new LineNumberReader(fr); // Use line number reader class
//
while (lineToRead.readLine() != null){
linereader++;
}
linereader = 0;
lineToRead.setLineNumber(0); //reset line number
h5Files = new String[linereader];
while (lineToRead.readLine() != null){
h5Files[linereader] = lineToRead.readLine(); // deposit string into array
linereader++;
}
return h5Files;
}
while(i !=-1); // When i = -1 the end of the file has been reached
}
catch(IOException exc) {
System.out.println("Error reading file.");
}
try{
FileInputStream fin = new FileInputStream(fileToRead);
fin.close(); // close the file
}
catch(IOException exc) {
System.out.println("Error Closing File");
}
return h5Files;
}
Your code is very very odd. For example these two blocks make no sense:
try {
FileInputStream fin = new FileInputStream(fileToRead); // open this file
}
catch(FileNotFoundException exc) {
System.out.println("File Not Found");
}
try{
FileInputStream fin = new FileInputStream(fileToRead);
fin.close(); // close the file
}
catch(IOException exc) {
System.out.println("Error Closing File");
}
I don't know what you think they do, but besides the first one leaking an open file handle, they do nothing at all. The comments are more worrying; they suggest that you need to do more reading on IO in Java.
Deleting those blocks and tidying the code a bit (moving declarations, formatting) gives this:
public String[] ReadScanlist(String fileIn) {
String directory = "c:\\data\\";
String h5Files[];
File fileToRead = new File(directory + fileIn);
try {
int i = 0;
do {
FileReader fr = new FileReader(fileToRead);
LineNumberReader lineToRead = new LineNumberReader(fr);
int linereader = 0;
while (lineToRead.readLine() != null) {
linereader++;
}
linereader = 0;
lineToRead.setLineNumber(0);
h5Files = new String[linereader];
while (lineToRead.readLine() != null) {
h5Files[linereader] = lineToRead.readLine();
linereader++;
}
return h5Files;
} while (i != -1);
} catch (IOException exc) {
System.out.println("Error reading file.");
}
return h5Files;
}
My first bone of contention is the File-related code. First, File abstracts from the underlying OS, so using / is absolutely fine. Second, there is a reason File has a (File, String) constructor; this code should read:
File directory = new File("c:/data");
File fileToRead = new File(directory, fileIn);
But it should really be using the new Path API anyway (see below).
So, you declare h5Files[]. You then proceed to read the whole file to count the lines. You then assign h5Files[] to an array of the correct size. Finally you fill the array.
If you have an error anywhere before you assign h5Files[] you have not initialised it and therefore cannot return it. This is what the compiler is telling you.
I don't know what i does in this code, it is assigned to 0 at the top and then never reassigned. This is an infinite loop.
So, you need to rethink your logic. I would recommend throwing an IOException if you cannot read the file. Never return null - this is an anti-pattern and leads to all those thousands of null checks in your code. If you never return null you will never have to check for it.
May I suggest the following alternative code:
If you are on Java 7:
public String[] ReadScanlist(String fileIn) throws IOException {
final Path root = Paths.get("C:/data");
final List<String> lines = Files.readAllLines(root.resolve(fileIn), StandardCharsets.UTF_8);
return lines.toArray(new String[lines.size()]);
}
Or, if you have Java 8:
public String[] ReadScanlist(String fileIn) throws IOException {
final Path root = Paths.get("C:/data");
try (final Stream<String> lines = Files.lines(root.resolve(fileIn), StandardCharsets.UTF_8)) {
return lines.toArray(String[]::new);
}
}
Since I don't know how long the array will be I cannot initialize it
to a certain length.
I don't think an array is the correct solution for you then - not to say it can't be done, but you would be re-inventing the wheel.
I would suggest you use a LinkedList instead, something like:
LinkedList<String> h5Files = new LinkedList<>();
h5Files.add(lineToRead.readLine());
Alternatively you could re-invent the wheel by giving the array an arbitrary initial size, say 10, and then resizing it whenever it gets full, something like this:
h5Files = new String[10];
if (linereader == h5Files.length)
{
String[] temp = h5Files;
h5Files = new String[2 * linereader];
for (int i = 0; i < linereader; i++)
{
h5Files[i] = temp[i];
}
}
Either of these solutions would allow you to initialize the array (or array alternative) safely before your try block, so that you can still access it if any exceptions are thrown.
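A complete version of the resizing idea, as a sketch: GrowableLines and readAll are hypothetical names, and Arrays.copyOf replaces the manual copy loop. The array exists before the first line is read, and the final copy trims it to the number of lines actually stored.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.Arrays;

class GrowableLines {
    // Reads all lines, doubling the array whenever it is full.
    static String[] readAll(BufferedReader reader) throws IOException {
        String[] lines = new String[10]; // arbitrary initial capacity
        int count = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            if (count == lines.length) {
                lines = Arrays.copyOf(lines, 2 * lines.length);
            }
            lines[count++] = line;
        }
        return Arrays.copyOf(lines, count); // trim unused slots
    }
}
```

Note that readLine() is called exactly once per iteration here; calling it once in the loop condition and again in the body, as the original code does, would skip every other line.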
Here is your problem. Please take a look at this condensed version of your code with my comments.
String h5Files[]; // here you define the variable. It still is not initialized.
try{
..................
do {
h5Files = new String[linereader]; // here you initialize the variable
} while(i !=-1); // When i = -1 the end of the file has been reached
..................
catch(IOException exc) {
// if you are here the variable is still not initialized
System.out.println("Error reading file.");
}
// you continue reading file even if exception was thrown while opening the file
I think the problem is clearer now. You try to open the file and count its lines. If you succeed, you create the array. If not (i.e. when an exception is thrown), you catch the exception but still carry on as if nothing had happened. In that case your array is not initialized.
Now how to fix this?
Actually, if you failed to read the file the first time, you cannot continue. This may happen, for example, if the file does not exist. So you should either return when the first exception is thrown or not catch it at all. Indeed, there is nothing to be done with the file if an exception was thrown at any stage. An exception is not a return code; that is the reason exceptions exist.
So, just do not catch exceptions at all. Declare your method as throws IOException and remove all try/catch blocks.
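A minimal sketch of what that could look like. ScanlistReader is a hypothetical class name, and the method takes a Path instead of building one from "c:\\data\\", purely to keep the example self-contained. The list grows as needed, so no line count is required up front, and any IOException simply propagates to the caller.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class ScanlistReader {
    // No catch blocks: if the file cannot be opened or read, the caller finds out.
    static String[] readScanlist(Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines.toArray(new String[0]);
    }
}
```

Because every path through the method either returns a fully built array or throws, the "might not have been initialized" error cannot occur.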
I've been stuck on this for a couple of days; I just can't find a good method to open a .txt file. I need to convert it into a multiline string and set it as the text of a TextView. Can someone help me please?
As for the file location, I'm using:
String saveLoc = Environment.getExternalStorageDirectory()+"/My Documents/";
public String title;
public String Ftype = ".txt";
(saveLoc+title+Ftype) //is the file location.
I can normally read the data into a file input stream, but if I try to do anything with it I get loads of errors that won't even let my app run.
Use Apache Commons IO FileUtils.readLines():
readLines
public static List<String> readLines(File file, Charset encoding) throws IOException
Reads the contents of a file line by line to a List of Strings. The file is always closed.
Parameters:
file - the file to read, must not be null
encoding - the encoding to use, null means platform default
Returns:
the list of Strings representing each line in the file, never null
Throws:
IOException - in case of an I/O error
Since:
2.3
private static String readFile(String path) throws IOException
{
FileInputStream stream = new FileInputStream(new File(path));
try {
FileChannel fc = stream.getChannel();
MappedByteBuffer bb = fc.map(FileChannel.MapMode.READ_ONLY, 0, fc.size());
/* Instead of using default, pass in a decoder. */
return Charset.defaultCharset().decode(bb).toString();
}
// No catch block: an IOException propagates to the caller, as the signature declares
finally
{
stream.close();
}
}
private String readTxt(){
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
int i;
try {
// Open the stream inside the try block so that FileNotFoundException is handled too
InputStream inputStream = new FileInputStream("Text File Path Here");
i = inputStream.read();
while (i != -1)
{
byteArrayOutputStream.write(i);
i = inputStream.read();
}
inputStream.close();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
return byteArrayOutputStream.toString();
}
I had a problem with reading /proc/%d/stat files using my Java method copyFiles() (source code below).
I have found workaround using similar readProc() method.
Now I am wondering what the problem was. Output files were created, but each file had 0 bytes (in /proc/ all files report 0 bytes because it is not a standard filesystem). FileUtils is from the Apache Commons IO library.
I've tried to do the same using java.nio - again, IOException is being thrown that attributes are wrong for each file.
I removed some part of the code regarding parsing exceptions etc.
Why does this work with FileInputStream, but not with FileUtils.copyFile()?
public void copyFiles() {
final File dir = new File("/proc");
final String[] filedirArray = dir.list();
long counter = 0;
for(String filedir : filedirArray) {
final File checkFile = new File(dir, filedir);
if (checkFile.isDirectory()) {
try {
Integer.parseInt(filedir);
File srcFile = new File(checkFile, "stat");
File dstFile = new File("/home/waldekm/files/stat" + "." + Long.toString(counter++));
try {
FileUtils.copyFile(srcFile, dstFile);
} catch (IOException e1) {}
} catch (NumberFormatException e) {
// not a number, do nothing
}
}
}
}
public static void readProc(final String src, final String dst) {
FileInputStream in = null;
FileOutputStream out = null;
File srcFile = new File(src);
File dstFile = new File(dst);
try {
in = new FileInputStream(srcFile);
out = new FileOutputStream(dstFile);
int c;
while((c = in.read()) != -1) {
out.write(c);
}
} catch (IOException e1) {
} finally {
try {
if (in != null) {
in.close();
}
} catch (IOException e1) {}
try {
if (out != null) {
out.close();
}
} catch (IOException e1) {}
}
}
The reason is most likely that the operating system is reporting the file size as zero.
On my machine, man 2 stat says this:
"For most files under the /proc directory, stat() does not return the file size in the st_size field; instead the field is returned with the value 0."
(The stat system call will be what the JVM uses to find out what a file's size is.)
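Assuming the zero size reported by stat() is indeed the cause, a copy routine that trusts the reported length will copy nothing, while one that reads until end-of-stream is unaffected. A minimal sketch of the latter follows. ReadUntilEof and readAll are hypothetical names, and FileInputStream is used because the question reports that it works for these files.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

class ReadUntilEof {
    // Reads until read() reports end-of-stream; file.length() is never consulted.
    static byte[] readAll(File file) throws IOException {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        try (InputStream in = new FileInputStream(file)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                collected.write(buffer, 0, n);
            }
        }
        return collected.toByteArray();
    }
}
```

For a file such as /proc/<pid>/stat, file.length() would typically be 0 while readAll still returns the actual contents.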
Here is a code snippet that reads specific fields from a proc file, using methods that are available (but not documented directly) in the Process class of Android. Modify the FORMAT buffer and the output buffer size to read more or different values from the proc file:
int PROC_SPACE_TERM = (int)' ';
int PROC_OUT_LONG = 0x2000;
public static final int[] PROCESS_STATS_FORMAT = new int[] {
PROC_SPACE_TERM,
PROC_SPACE_TERM,
PROC_SPACE_TERM,
PROC_SPACE_TERM,
PROC_SPACE_TERM,
PROC_SPACE_TERM,
PROC_SPACE_TERM,
PROC_SPACE_TERM,
PROC_SPACE_TERM,
PROC_SPACE_TERM,
PROC_SPACE_TERM,
PROC_SPACE_TERM,
PROC_SPACE_TERM,
PROC_SPACE_TERM|PROC_OUT_LONG, // 13: utime
PROC_SPACE_TERM|PROC_OUT_LONG // 14: stime
};
long buf[] = new long[2];
try {
int pid = 1000; // Assume 1000 is a valid pid for a process.
Method mReadProcFile =
Process.class.getMethod("readProcFile", String.class,
int[].class, String[].class,
long[].class, float[].class);
mReadProcFile.invoke(null, "/proc/" + pid + "/stat",
PROCESS_STATS_FORMAT, null, buf, null);
return buf;
} catch(NoSuchMethodException e) {
Log.e(TAG, "Error! Could not get access to JNI method - readProcFile");
} catch (InvocationTargetException e) {
Log.e(TAG, "Error! Could not invoke JNI method - readProcFile");
} catch (IllegalAccessException e) {
Log.e(TAG, "Error! Illegal access while invoking JNI method - readProcFile");
}
return null;
I see you are creating a FileInputStream to read a /proc file. Instead I suggest you create a FileReader object. FileInputStream gets tripped up by the lack of file length for /proc files but FileReader does not.