Need to upload and parse 15MB files, open files twice? - java

I have a file which I need to upload to a service and parse into relevant data. The parser and the uploader both require an InputStream. Should I open the file twice? I could read the file into a String, but keeping many of these files in memory is a concern.
EDIT: Thought I should make it clear that the parsing and uploading are entirely separate processes.

Since you are parsing the file anyway, the most memory-efficient option is to load it into a String once. Parse it into indexes into that String; you save memory and can upload the String whenever you want. This is the most effective approach in terms of memory, though perhaps not processing time.
A reply to one of the comments above:
"Separate processes" does not mean different threads or operating-system processes, just that the parsing and the uploading do not need each other to operate.
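A minimal sketch of the buffer-once idea from the answer above. The answer suggests a String; here a byte array wrapped in ByteArrayInputStream is used instead, since both consumers want an InputStream and the ~15 MB files fit in memory. The file path and the parse/upload entry points are placeholders:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

public class UploadAndParse {
    public static void main(String[] args) throws IOException {
        // Read the file into memory once (~15 MB per file).
        byte[] bytes = Files.readAllBytes(Paths.get("path/to/file.xml"));

        // Each consumer gets its own independent stream over the same buffer,
        // so the file on disk is only opened and read a single time.
        try (InputStream forParser = new ByteArrayInputStream(bytes);
             InputStream forUploader = new ByteArrayInputStream(bytes)) {
            parse(forParser);    // hypothetical parser entry point
            upload(forUploader); // hypothetical uploader entry point
        }
    }

    private static void parse(InputStream in) throws IOException { /* ... */ }

    private static void upload(InputStream in) throws IOException { /* ... */ }
}
```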

Related

Reading file on remote server via Java

I am developing a Java application through which I need to read the log files present on a server and perform operations depending on the content of the logs.
Files range from 3GB up to 9GB.
Here on Stack Overflow I have already read the discussion about reading large files with Java; I am attaching the link:
Java reading large file discussion
In that discussion the files are read locally; in my case I have to retrieve and read the files on the server. Is there an efficient way to achieve this?
I would like to avoid having to download the files, given their size.
I had thought about using a URL reader to retrieve the files, but I have doubts about the execution speed.
The files I need to recover are under the path C:\production\LOG\file.log
Do you have any suggestions or advice?
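For illustration only, a minimal sketch of the "URL reader" idea mentioned in the question, assuming the server exposes the log over HTTP at a hypothetical URL; the file is streamed line by line instead of being downloaded in full:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class RemoteLogReader {
    public static void main(String[] args) throws Exception {
        // Hypothetical endpoint that serves C:\production\LOG\file.log over HTTP.
        URL url = new URL("http://example-server/logs/file.log");

        // Stream the response line by line; only the current line is held in memory,
        // so 3-9 GB files never need to be saved to local disk first.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                process(line); // hypothetical per-line handling
            }
        }
    }

    private static void process(String line) { /* ... */ }
}
```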

Java Servlet 3.0 File Upload to input stream - without intermediate folders or files being created

I don't know how to do this, or whether it is possible or wise, so any form of answer that points me to a library, example, or reasoning will be helpful.
I need to upload and process some XML files (actually XSLT files: XML Excel files).
I don't want to store the file on the server and then invoke processing on it. Instead, I want to stream the file in and process it as a stream.
I also want to be able to handle multipart file uploads, but still process them as an input stream.
I am expressly trying to avoid creating a file on disk for this.
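A minimal sketch of what this might look like with the Servlet 3.0 multipart API (the servlet path and form-field name are made up). Note that, depending on the container's multipart configuration, large parts may still be buffered to temporary files behind the scenes:

```java
import java.io.IOException;
import java.io.InputStream;
import javax.servlet.ServletException;
import javax.servlet.annotation.MultipartConfig;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.Part;

// Hypothetical upload endpoint; @MultipartConfig enables Servlet 3.0 multipart parsing.
@WebServlet("/upload")
@MultipartConfig
public class XmlUploadServlet extends HttpServlet {

    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        // "file" is the assumed name of the form field in the multipart request.
        Part part = req.getPart("file");

        // Process the upload directly as a stream instead of saving it to disk first.
        try (InputStream in = part.getInputStream()) {
            processXml(in); // hypothetical streaming XML/XSLT processing
        }
    }

    private void processXml(InputStream in) throws IOException { /* ... */ }
}
```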

Java heap size error in mirth

I am using Mirth Connect 3.0.3 and I have an .xml file that is almost 85 MB in size and contains some device information. I need to read this .xml file and insert its data into the database (SQL Server).
The problem I am facing is that when I try to read the data, it shows a Java heap size error.
I increased the server memory to 1024 MB and the client memory to 1024 MB, but it still shows the same error. If I increase the memory further, I am not able to start Mirth Connect.
Any suggestion is appreciated.
Thanks.
Is the XML file made up of multiple separate sections/pieces of data that would make sense to split up into multiple channel messages? If so, consider using a Batch Adapter. The XML data type has options to split based on element/tag name, node depth/level, or an XPath query. All of those options currently still require the message to be read into memory in its entirety, but it will still be more memory-efficient than reading the entire XML document in as a single message.
You can also use a JavaScript batch script, in which case you're given a Java BufferedReader, and can use the script to read through the file and return a message at a time. In this case, you will not have to read the entire file into memory.
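Outside of Mirth, the same "one message at a time" idea could look roughly like this in plain Java (a Mirth batch script would do the equivalent in JavaScript against the BufferedReader it is handed; the repeating device element name is invented):

```java
import java.io.BufferedReader;
import java.io.IOException;

// Reads one <device>...</device> block per call instead of loading the whole 85 MB file.
public class DeviceBatchReader {
    private final BufferedReader reader;

    public DeviceBatchReader(BufferedReader reader) {
        this.reader = reader;
    }

    /** Returns the next message, or null when the file is exhausted. */
    public String nextMessage() throws IOException {
        StringBuilder message = new StringBuilder();
        boolean inRecord = false;
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.contains("<device>")) {   // hypothetical repeating element
                inRecord = true;
            }
            if (inRecord) {
                message.append(line).append('\n');
            }
            if (line.contains("</device>")) {
                return message.toString();     // hand back one record as one message
            }
        }
        return null; // end of file
    }
}
```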
Are there large blobs of data in the message that don't need to be manipulated in a transformer? Like, embedded images, etc? If so, consider using an Attachment Handler. That way you can extract that data and store it once, rather than having it copied and stored multiple times throughout the message lifecycle (for Raw / Transformed / Encoded / etc.).

Java XML practice

I have an application that receives weather information every x seconds. I want to save this data to an XML file.
Should I create a new XML file for each weather notification, or append each notification to the same XML file? I am not sure of the XML standards of what is common practice.
I highly recommend appending not because that is a standard practice of XML, but more because creating a new file every x seconds will likely be a very difficult way to manage your data. You may also run into limitations of your file system (e.g. maximum files per directory).
You might also consider using a database instead of files to store your data.
XML files have only one root element. You can write multiple XML fragments into the file but it won't be a valid document then. So while both options are fine, and you should consider your other requirements too, the standard somewhat nudges you towards writing a file (or a database row) per notification.
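As a concrete illustration of the trade-off both answers describe, here is a minimal sketch (element and field names invented) that appends one self-contained fragment per notification; the resulting file is easy to manage, but it is a sequence of XML fragments rather than a single well-formed document with one root:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.time.Instant;

public class WeatherLog {
    // Appends one <notification> fragment per reading to a single file.
    // Note: the file as a whole has no single root element, so it is a log of
    // XML fragments, not one valid XML document.
    public static void append(double temperatureC, double humidityPct) throws IOException {
        String fragment = String.format(
            "<notification time=\"%s\"><temperatureC>%.1f</temperatureC>"
            + "<humidityPct>%.1f</humidityPct></notification>%n",
            Instant.now(), temperatureC, humidityPct);

        Files.write(Paths.get("weather-log.xml"),
                fragment.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }
}
```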

Sending files by part in JAVA

I am writing a client-server program in Java in which I am sending a file from the server to the client. As the file size may be quite large, I decided to divide the file into 5 parts and then send them to the same client in 5 different threads.
My algorithm is to use the Java ZIP API to create a zip file of the file to be sent, and then divide the zip file into 5 parts.
The problem is that there is no method in the ZIP API that could divide the file.
This is the tutorial I am referring to for sending files through threads.
Can anyone tell me whether there is anything wrong with my algorithm, or do I need a different strategy?
You should separate the zipping part from the splitting part. If you have to send these to a client, you probably don't want to keep the complete zip file in memory while you wait for the client to request the next chunk... so the simplest approach would be to zip to disk first, and then serve that file in chunks. At that point, it really doesn't matter that it's a zip file at all - and indeed for certain files types (e.g. images, sound, video) you may not want to go via a zip file at all.
I would suggest you tell the client the file name and size, and then let the client request whatever section of the file it wants. It can then decide what chunk size to use: you just need to seek to the right bit of the file and serve as much data as the client has requested.
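A minimal sketch of that seek-and-serve idea, assuming the client has already told the server which offset and length it wants (the method name and buffer size are illustrative):

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.RandomAccessFile;

public class ChunkServer {
    // Writes the requested [offset, offset + length) slice of the file to the client.
    public static void sendChunk(String path, long offset, int length, OutputStream out)
            throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            file.seek(offset);                   // jump to the requested position
            byte[] buffer = new byte[8192];
            int remaining = length;
            while (remaining > 0) {
                int read = file.read(buffer, 0, Math.min(buffer.length, remaining));
                if (read == -1) {
                    break;                       // client asked past the end of the file
                }
                out.write(buffer, 0, read);
                remaining -= read;
            }
        }
    }
}
```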
Breaking up the file isn't a ZIP function. You could create multiple byte arrays from the resulting zip file (by segmenting the array) and sending each segment in a different thread. This would be similar to what download managers of yesteryear would do.
The client would then have code to re-assemble the byte array in the correct order. You'd probably need to add some additional information to each segment like the correct sequence, the filename to be restored, and the number of segments expected.
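For the byte-array approach, a sketch of what each segment might carry so the client can reassemble the pieces in order (the class and field names are invented):

```java
import java.io.Serializable;
import java.util.Arrays;

// One piece of the zipped file, carrying enough metadata for reassembly on the client.
public class FileSegment implements Serializable {
    final String fileName;    // name the client should restore
    final int sequenceNumber; // position of this segment (0-based)
    final int totalSegments;  // how many segments to expect in total
    final byte[] data;        // this segment's slice of the zip bytes

    FileSegment(String fileName, int sequenceNumber, int totalSegments, byte[] data) {
        this.fileName = fileName;
        this.sequenceNumber = sequenceNumber;
        this.totalSegments = totalSegments;
        this.data = data;
    }

    // Splits the zipped bytes into a fixed number of segments, e.g. 5.
    static FileSegment[] split(String fileName, byte[] zipBytes, int totalSegments) {
        FileSegment[] segments = new FileSegment[totalSegments];
        int chunkSize = (zipBytes.length + totalSegments - 1) / totalSegments;
        for (int i = 0; i < totalSegments; i++) {
            int from = Math.min(zipBytes.length, i * chunkSize);
            int to = Math.min(zipBytes.length, from + chunkSize);
            segments[i] = new FileSegment(fileName, i, totalSegments,
                    Arrays.copyOfRange(zipBytes, from, to));
        }
        return segments;
    }
}
```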
