S3Async client complete multipart upload after application restart

Related to an older question:
My application uses the AWS Java S3Async client for multipart uploads.
When the application pauses and resumes an upload, everything works. But when the application is restarted, a new uploadId is generated and completing the upload fails with:
Exception: One or more of the specified parts could not be found. The part may not have been uploaded, or the specified entity tag may not match the part's entity tag.
This is probably because some parts were uploaded under a different uploadId.
The same happens when I use the AWS CLI to complete the upload.
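
If the cause really is the new uploadId, one way to approach it is to store the uploadId and each part's ETag on disk as the upload progresses, and reload them after a restart instead of creating a new multipart upload. The following is only a minimal sketch under that assumption. The class name UploadState and the properties-file layout are hypothetical, and the actual S3 calls are left out. As far as I know, the AWS SDK for Java v2 can also list the parts already stored for an upload ID with listParts, which could replace the part bookkeeping here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Hypothetical helper: keeps multipart upload state on disk so it survives a restart. */
final class UploadState {
    private final Path file;
    private final Properties props = new Properties();

    UploadState(Path file) {
        this.file = file;
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                props.load(in);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    /** Upload ID from a previous run, or null if no upload has been started. */
    String uploadId() {
        return props.getProperty("uploadId");
    }

    /** Call once, right after the multipart upload is created. */
    void start(String uploadId) {
        props.setProperty("uploadId", uploadId);
        save();
    }

    /** Call after each part upload succeeds, with the ETag S3 returned for it. */
    void partDone(int partNumber, String eTag) {
        props.setProperty("part." + partNumber, eTag);
        save();
    }

    /** Part number -> ETag, sorted by part number, as needed to complete the upload. */
    Map<Integer, String> completedParts() {
        Map<Integer, String> parts = new TreeMap<>();
        for (String key : props.stringPropertyNames()) {
            if (key.startsWith("part.")) {
                parts.put(Integer.parseInt(key.substring(5)), props.getProperty(key));
            }
        }
        return parts;
    }

    private void save() {
        try (OutputStream out = Files.newOutputStream(file)) {
            props.store(out, "multipart upload state");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

On startup the application would check uploadId() first. If it is not null, it would skip creating a new upload, send only the parts missing from completedParts(), and complete the upload under that same uploadId.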

Related

Downloading big files and passing them back to the client in chunks

The background:
I have a Java Spring boot application.
I would like to have an API endpoint where the user can request to download a file.
The file is generated by the application and stored in some remote storage (S3, for example) prior to the download request.
The customer does not have access to the remote storage.
He sends a request to the application and, as far as he is concerned, the file is coming from the application.
The downloaded file can be very big.
The question:
Is there a way to have the application act as some sort of tunnel?
Meaning: the application gets a request, starts downloading the file chunk by chunk, and returns every downloaded chunk to the client as it arrives.
This would create an experience similar to downloading from S3 itself, without forcing the application to save the file locally or load the whole thing into memory.
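
The chunk-by-chunk tunnel described above comes down to a copy loop between two streams. This is a minimal sketch, not a full endpoint. The class and method names are hypothetical, and it assumes the remote storage client hands you an InputStream and the web framework hands you the response OutputStream (in Spring MVC, returning a StreamingResponseBody is one way to get the latter).

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper: relays a remote file to the client without holding it all in memory. */
final class StreamTunnel {
    /** Copies source to sink in fixed-size chunks and returns the number of bytes copied. */
    static long pipe(InputStream source, OutputStream sink, int chunkSize) throws IOException {
        byte[] buffer = new byte[chunkSize]; // the only memory held per download
        long total = 0;
        int read;
        while ((read = source.read(buffer)) != -1) {
            sink.write(buffer, 0, read);
            sink.flush(); // push each chunk to the client instead of letting it accumulate
            total += read;
        }
        return total;
    }
}
```

Memory use per download stays at roughly chunkSize bytes regardless of the file size.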

Java Spring: Real-time status update to the client over REST API

I am developing a web application in Java Spring where I want the user to be able to upload a CSV file from the front end and then see the real-time progress of the import. After the import he should be able to search for individual entries in the imported data.
The import would consist of uploading the file (sending it via a REST API POST request) and then reading it and saving its contents to a database so the user can search this data.
How could I show the real-time progress of this process? I found a tutorial for jQuery that shows the progress of the amount of data uploaded/transferred, but since most of the work is done while processing the uploaded file, I would prefer a solution where I first count the lines in the file, and then the user sees a live message like:
Lines processed: 1 out of 10000
It could update/change incrementally, but as one line is processed pretty quickly, showing each number of lines processed is not that important.
Either way, the question is, what's the easiest way to send these messages from Spring REST API to the client?

I found a solution myself and used Web Sockets for that.
I used this approach from the Spring documentation:
https://spring.io/guides/gs/messaging-stomp-websocket/
It can send a message to the front-end listener for each processed line (once the web socket connection and topic are set up). In my case I used a batch insert for the data import, so per-line updates were not available to me, but web sockets are capable of doing that.
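
The progress message from the question can be produced by a small loop that reports every N lines rather than every line. This is only a sketch: the class name and the reportEvery parameter are hypothetical, and the publish callback stands in for whatever actually sends the message. With the STOMP setup from the guide above, that could be a lambda calling SimpMessagingTemplate.convertAndSend with a topic of your choosing.

```java
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper: processes lines and publishes a progress message every few lines. */
final class ImportProgress {
    static void importLines(List<String> lines, int reportEvery, Consumer<String> publish) {
        int total = lines.size();
        for (int i = 0; i < total; i++) {
            // ... parse lines.get(i) and save it to the database here ...
            int done = i + 1;
            // Report at each interval, and always once more when the last line is done.
            if (done % reportEvery == 0 || done == total) {
                publish.accept("Lines processed: " + done + " out of " + total);
            }
        }
    }
}
```

Reporting per interval also fits a batch insert: set reportEvery to the batch size and publish after each batch is written.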

download of a large file from webservice causing application performance issue

I have gone through the previous posts and similar questions were asked but I couldn't find a solution to my problem.
In my application users can download files. When a user clicks download, our application server internally sets up an authenticated session with the web service, which sends the file data in the XML response as below:
<FileSerial xmlns="http://my.example.com/webservices">
<filedata>base64Binary</filedata>
<filesize>int</filesize>
<filetype>string</filetype>
<mime_type>string</mime_type>
</FileSerial>
And I have used spring-ws as below :
GetDocResponse docResponse = (GetDocResponse) webServiceTemplate.marshalSendAndReceive(getDoc);
FileSerial fileSerial = docResponse.getGetDocResult();
fileByte = fileSerial.getFiledata();
After several users start downloads, the JVM memory usage of our application server climbs very high, the server stops responding, and it has to be restarted.
My guess is that fileByte is held in the application server's memory and that is causing the issue.
Is there any way to stream the file directly to the client browser without storing it in the application server's memory?
Any sample code would help.

You are loading the full document onto your heap, plus the base64 conversion. Unless you pass the data by reference, every mapping of your binary data to another object creates another copy on your heap.
You should use a multipart request and send the document as an attachment to your WS request.
MTOM example
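
To illustrate the heap point, here is a sketch of decoding base64 as a stream instead of as one big byte array. Note that this is not the MTOM approach recommended above. It is a different technique that only applies if you can read the base64 text of the response as an InputStream rather than through the marshalled FileSerial object. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Base64;

/** Hypothetical helper: decodes base64 text on the fly while copying it to the client. */
final class Base64Streaming {
    /** Decodes base64Text into sink and returns the number of decoded bytes written. */
    static long decodeTo(InputStream base64Text, OutputStream sink) throws IOException {
        // The MIME decoder tolerates line breaks inside the encoded text.
        try (InputStream decoded = Base64.getMimeDecoder().wrap(base64Text)) {
            return decoded.transferTo(sink); // copies through a small internal buffer
        }
    }
}
```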

Java Multipart/post download

I have a few questions related to web technologies. From reading and looking at the Apache and Netty documentation, I could not figure out a few things about downloading a large file with an HTTP multipart/post request.
Is it possible to send an HTTP request asking to download a file in smaller parts (chunks)?
How do I download a large file in multiple parts?
Please correct me if I have misunderstood the term 'multipart' itself. I know a lot of people have faced this problem, where the application (client) downloads a file in smaller portions so that, when a network outage happens, it does not need to download the whole file from the beginning again. Especially when the file is not a media file.
Thanks.

Multipart refers to encoding multiple documents in one body, see this for the definition. For HTTP, a multipart upload allows the client to send multiple documents with one POST, for example uploading an image and form fields in one request.
Multipart does not refer to downloading a document in multiple chunks.
You can use HTTP range requests to resume a download if a network outage occurs.
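
A range request is an ordinary GET with a Range header. This sketch uses the java.net.http client from Java 11. The class and method names are hypothetical, and the offset would typically be the size of the partial file already on disk.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical helper: builds a request that resumes a download at a byte offset. */
final class ResumeDownload {
    /** Builds a GET that asks the server for everything from byte offset onward. */
    static HttpRequest resumeFrom(URI uri, long offset) {
        return HttpRequest.newBuilder(uri)
                .header("Range", "bytes=" + offset + "-")
                .GET()
                .build();
    }
}
```

A server that honors the range answers with 206 Partial Content. A plain 200 means it ignored the header and is sending the whole file again, so the client should check the status before appending to the partial file.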

Make two servers talk to each other

I have an application written in GWT and hosted on Google App Engine/Java. In this application the user will have an option to upload a video/audio/text file to the server. Those files could be big, up to 1 GB or so, and because GAE/J does not support large files I have to use another server to store them. This would be easy to implement if browsers had no cross-domain security restrictions. So what I'm thinking is to make the GAE server talk to my server (Glassfish, or any other Java server if needed) to pass along the URL of the file and, if possible, the upload status (what percentage has been uploaded) so I can show it on the client's screen. Here is what I'm thinking of doing.
When the user loads the GWT page that is served from GAE/J, he/she will upload the file to my server; my server will then send a response back to GAE, and GAE will send a response to the client.
If this scenario is possible, what would be the best way to implement the GAE-to-Glassfish conversation?

Before that, you could try the first approach and work around the browser's cross-domain restrictions with an iframe. There are some ready-to-use components for this, but I don't know which of them would suit your problem. Just search for these components.

To do it the way you originally suggested, use the URL Fetch Service.
The downside of doing it the other way is that you introduce dependencies on multiple sites inside your web pages.
The downside of using the URL Fetch Service is that you pay per byte transferred once you have used up the free quota.

One option would be to wait - the blobstore limit won't always be 50MB!
If you're in a hurry, though, I would suggest an approach like the following:
1. Have your App Engine app generate a signed token that signifies the user has permission to upload a file. The token should include the current date and time, the user's user ID, the maximum file size, and any other relevant information, and should be signed using HMAC-SHA1 with a secret key that your App Engine app and your server both know.
2. Return a form to the user that POSTs to a URL on your blob hosting server, and embeds the token you generated in step 1. If you want progress notifications, you can use a tool like plupload, and serve the form in an IFrame served by your upload server.
3. When the user uploads the file to your server, the server should return a redirect back to your App Engine app, with a new token embedded in the redirect URL. That token, again signed with a common secret, contains the ID of the newly uploaded file.
4. When your App Engine app receives a request for the redirect URL, it knows the upload was completed, and can record the new file's ID etc in the datastore.
Alternately, you can use Amazon's S3, which already supports all this with its HTML Form support.
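
The signed token from the steps above could look roughly like this with the standard javax.crypto classes. It is a sketch under assumptions: the class name, the payload layout and the "payload.signature" token format are all hypothetical. Both servers would call it with the same shared secret.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical helper: signs and verifies upload tokens with a shared HMAC-SHA1 secret. */
final class UploadToken {
    /** Returns "payload.signature", where signature is URL-safe base64 of the HMAC. */
    static String sign(String payload, byte[] secret) {
        return payload + "." + hmac(payload, secret);
    }

    /** Returns the payload if the signature matches, otherwise null. */
    static String verify(String token, byte[] secret) {
        int dot = token.lastIndexOf('.'); // URL-safe base64 never contains a dot
        if (dot < 0) {
            return null;
        }
        String payload = token.substring(0, dot);
        byte[] expected = hmac(payload, secret).getBytes(StandardCharsets.US_ASCII);
        byte[] actual = token.substring(dot + 1).getBytes(StandardCharsets.US_ASCII);
        // MessageDigest.isEqual compares signatures without leaking timing information
        return MessageDigest.isEqual(expected, actual) ? payload : null;
    }

    private static String hmac(String payload, byte[] secret) {
        try {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(secret, "HmacSHA1"));
            byte[] sig = mac.doFinal(payload.getBytes(StandardCharsets.UTF_8));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(sig);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

After verify returns the payload, the receiving server would still need to check the date and time inside it and reject tokens that are too old.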
