This is what I do to write to the InputStream that feeds the S3 upload:
public OutputStream getOutputStream(@Nonnull final String uniqueId) throws PersistenceException {
final PipedOutputStream outputStream = new PipedOutputStream();
final PipedInputStream inputStream;
try {
inputStream = new PipedInputStream(outputStream);
new Thread(
new Runnable() {
@Override
public void run() {
PutObjectRequest putObjectRequest = new PutObjectRequest("haritdev.sunrun", "sample.file.key", inputStream, new ObjectMetadata());
PutObjectResult result = amazonS3Client.putObject(putObjectRequest);
LOGGER.info("result - " + result.toString());
try {
inputStream.close();
} catch (IOException e) {
// ignore: failing to close the piped input stream here is not fatal
}
}
}
).start();
} catch (AmazonS3Exception e) {
throw new PersistenceException("could not generate output stream for " + uniqueId, e);
} catch (IOException e) {
throw new PersistenceException("could not generate input stream for S3 for " + uniqueId, e);
}
try {
return new GZIPOutputStream(outputStream);
} catch (IOException e) {
LOGGER.error(e.getMessage(), e);
throw new PersistenceException("Failed to get output stream for " + uniqueId + ": " + e.getMessage(), e);
}
}
and in the following method, I see my process die
protected <X extends AmazonWebServiceRequest> Request<X> createRequest(String bucketName, String key, X originalRequest, HttpMethodName httpMethod) {
Request<X> request = new DefaultRequest<X>(originalRequest, Constants.S3_SERVICE_NAME);
request.setHttpMethod(httpMethod);
if (bucketNameUtils.isDNSBucketName(bucketName)) {
request.setEndpoint(convertToVirtualHostEndpoint(bucketName));
request.setResourcePath(ServiceUtils.urlEncode(key));
} else {
request.setEndpoint(endpoint);
if (bucketName != null) {
/*
* We don't URL encode the bucket name, since it shouldn't
* contain any characters that need to be encoded based on
* Amazon S3's naming restrictions.
*/
request.setResourcePath(bucketName + "/"
+ (key != null ? ServiceUtils.urlEncode(key) : ""));
}
}
return request;
}
The process fails on request.setResourcePath(ServiceUtils.urlEncode(key)); and I can't even debug past that point, even though the key is a valid name and is not null.
Can someone please help?
This is how the request looks before dying
request = {com.amazonaws.DefaultRequest#1931}"PUT https://my.bucket.s3.amazonaws.com / "
resourcePath = null
parameters = {java.util.HashMap#1959} size = 0
headers = {java.util.HashMap#1963} size = 0
endpoint = {java.net.URI#1965}"https://my.bucket.s3.amazonaws.com"
serviceName = {java.lang.String#1910}"Amazon S3"
originalRequest = {com.amazonaws.services.s3.model.PutObjectRequest#1285}
httpMethod = {com.amazonaws.http.HttpMethodName#1286}"PUT"
content = null
I tried the same approach and it failed for me as well.
I ended up writing all my data to the output stream first, and then initiating the upload to S3 after copying the data from the output stream to the input stream:
...
// Data written to outputStream here
...
byte[] byteArray = outputStream.toByteArray();
amazonS3Client.uploadPart(new UploadPartRequest()
.withBucketName(bucket)
.withKey(key)
.withInputStream(new ByteArrayInputStream(byteArray))
.withPartSize(byteArray.length)
.withUploadId(uploadId)
.withPartNumber(partNumber));
It kind of defeats the purpose of writing to a stream if the entire data block has to be written and copied in memory before the upload to S3 can even begin, but it's the only way I could get it to work.
Here is what I tried, and it worked:
try (PipedOutputStream pipedOutputStream = new PipedOutputStream();
PipedInputStream pipedInputStream = new PipedInputStream(pipedOutputStream)) { // connect the two ends of the pipe
new Thread(new Runnable() {
public void run() {
try {
// write some data to pipedOutputStream, then close it so putObject sees end-of-stream
} catch (IOException e) {
// handle exception
}
}
}).start();
PutObjectRequest putObjectRequest = new PutObjectRequest(BUCKET, FILE_NAME, pipedInputStream, new ObjectMetadata());
s3Client.putObject(putObjectRequest);
}
This code worked, although the S3 client logged a warning that the content length is not set, so the stream will be buffered and could result in out-of-memory errors. I am not aware of any cheap way to determine the content length up front just to set it in ObjectMetadata and get rid of this message, and I hope the AWS SDK does not read the whole stream into memory just to find the content length.
Related
I have a list of Java objects which I need to upload to Amazon S3. Currently I am uploading each object one by one, which is quite inefficient. Each object in the list is first serialized to JSON and separated by a newline, i.e. \n:
private ByteBufferOutputStream serialize(final Object object) {
try {
final ByteBufferOutputStream outputStream = new ByteBufferOutputStream(128, true, false);
Json.writeValueAsToOutputStream(object, outputStream);
outputStream.write('\n');
return outputStream;
} catch (final Exception e) {
log.error("Error serializing data {}", object, e);
return null;
}
}
for (Data event : events) {
final ByteBufferOutputStream serializedEvent = serialize(event);
if (serializedEvent == null) {
continue;
}
PutObjectRequest objectRequest = PutObjectRequest.builder()
.bucket(bucketName)
.key(key)
.build();
try {
s3Client.putObject(objectRequest, RequestBody.fromByteBuffer(serializedEvent.toByteBuffer()));
} catch (S3Exception e) {
log.error("Failed to send raw data events to S3", e);
}
}
How can I upload all the objects in the list in a single request after serialization?
I am creating a tar.gz file using GZIPOutputStream, and I have added retry logic: if any exception is caught while compressing the file, my code retries up to three times.
When I throw an IOException to test my retry logic, it throws the exception below:
java.io.IOException: request to write '4096' bytes exceeds size in header of '2644' bytes for entry 'Alldbtypes'
I get the exception at the line org.apache.commons.io.IOUtils.copyLarge(inputStream, tarStream);
private class CompressionStream extends GZIPOutputStream {
// Use compression levels from the deflator class
public CompressionStream(OutputStream out, int compressionLevel) throws IOException {
super(out);
def.setLevel(compressionLevel);
}
}
public void createTAR(){
boolean isSuccessful=false;
int count = 0;
int maxTries = 3;
while(!isSuccessful) {
InputStream inputStream =null;
FileOutputStream outputStream =null;
CompressionStream compressionStream=null;
OutputStream md5OutputStream = null;
TarArchiveOutputStream tarStream = null;
try{
inputStream = new BufferedInputStream(new FileInputStream(rawfile));
File stagingPath = new File("C:\\Workarea\\6d22b6a3-564f-42b4-be83-9e1573a718cd\\b88beb62-aa65-4ad5-b46c-4f2e9c892259.tar.gz");
boolean isDeleted = false;
if(stagingPath.exists()){
isDeleted = stagingPath.delete();
if(stagingPath.exists()){
try {
FileUtils.forceDelete(stagingPath);
}catch (IOException ex){
//ignore
}
}
}
outputStream = new FileOutputStream(stagingPath);
if (isCompressionEnabled) {
compressionStream = new CompressionStream(outputStream, getCompressionLevel(om));
}
final MessageDigest outputDigest = MessageDigest.getInstance("MD5");
md5OutputStream = new DigestOutputStream(isCompressionEnabled ? compressionStream : outputStream, outputDigest);
tarStream = new TarArchiveOutputStream(new BufferedOutputStream(md5OutputStream));
tarStream.setLongFileMode(TarArchiveOutputStream.LONGFILE_GNU);
tarStream.setBigNumberMode(TarArchiveOutputStream.BIGNUMBER_STAR);
TarArchiveEntry entry = new TarArchiveEntry("Alldbtypes");
entry.setSize(getOriginalSize());
entry.setModTime(getLastModified().getMillis());
tarStream.putArchiveEntry(entry);
org.apache.commons.io.IOUtils.copyLarge(inputStream, tarStream);
inputStream.close();
tarStream.closeArchiveEntry();
tarStream.finish();
tarStream.close();
String digest = Hex.encodeHexString(outputDigest.digest());
setChecksum(digest);
setIngested(DateTime.now());
setOriginalSize(FileUtils.sizeOf(stagingPath));
isSuccessful =true;
} catch (IOException e) {
if (++count == maxTries) {
throw new RuntimeException("Exception: " + e.getMessage(), e);
}
} catch (NoSuchAlgorithmException e) {
throw new RuntimeException("MD5 hash algo not installed.", e);
} catch (Exception e) {
throw new RuntimeException("Exception: " + e.getMessage(), e);
} finally {
org.apache.commons.io.IOUtils.closeQuietly(inputStream);
if (tarStream != null) { // guard: an early failure can leave tarStream unassigned
try {
tarStream.flush();
tarStream.finish();
} catch (IOException e) {
e.printStackTrace();
}
}
org.apache.commons.io.IOUtils.closeQuietly(tarStream);
org.apache.commons.io.IOUtils.closeQuietly(compressionStream);
org.apache.commons.io.IOUtils.closeQuietly(md5OutputStream);
org.apache.commons.io.IOUtils.closeQuietly(outputStream);
}
}
}
Case solved. The exception java.io.IOException: request to write '4096' bytes exceeds size in header of '2644' bytes for entry 'Alldbtypes' is thrown when the size declared in the header for the file being archived is incorrect.
TarArchiveEntry entry = new TarArchiveEntry("Alldbtypes");
entry.setSize(getOriginalSize());
In my code, getOriginalSize() is updated again at the end (via setOriginalSize(FileUtils.sizeOf(stagingPath))), so on a retry the original size had changed to the size of the compressed file, which is why this exception was thrown.
A few days ago, I struggled with how to access a file sent by a Netty client without killing the Netty server. I got a solution on Stack Overflow, and the detail of the question is here. The solution is that the client closes the channel after sending the file, and the server closes the FileOutputStream in the channelInactive method. The main code is below.
ClientHandler
public class FileClientHandler extends ChannelInboundHandlerAdapter {
private int readLength = 128;
@Override
public void channelActive(ChannelHandlerContext ctx) throws Exception {
sendFile(ctx.channel());
}
private void sendFile(Channel channel) throws IOException {
File file = new File("C:\\Users\\xxx\\Desktop\\1.png");
FileInputStream fis = new FileInputStream(file);
BufferedInputStream bis = new BufferedInputStream(fis);
ChannelFuture lastFuture = null;
for (;;) {
byte[] bytes = new byte[readLength];
int readNum = bis.read(bytes, 0, readLength);
if (readNum == -1) { // The end of the stream has been reached
bis.close();
fis.close();
lastFuture = sendToServer(bytes, channel, 0);
if(lastFuture == null) { // When our file is 0 bytes long, this is true
channel.close();
} else {
lastFuture.addListener(ChannelFutureListener.CLOSE);
}
return;
}
lastFuture = sendToServer(bytes, channel, readNum);
}
}
private ChannelFuture sendToServer(byte[] bytes, Channel channel, int length)
throws IOException {
return channel.writeAndFlush(Unpooled.copiedBuffer(bytes, 0, length));
}
}
ServerHandler
public class FileServerHandler extends ChannelInboundHandlerAdapter {
private File file = new File("C:\\Users\\xxx\\Desktop\\2.png");
private FileOutputStream fos;
public FileServerHandler() {
try {
if (!file.exists()) {
file.createNewFile();
} else {
file.delete();
file.createNewFile();
}
fos = new FileOutputStream(file);
} catch (IOException e) {
e.printStackTrace();
}
}
@Override
public void channelInactive(ChannelHandlerContext ctx) {
System.out.println("I want to close fileoutputstream!");
try {
fos.close();
} catch (IOException e) {
e.printStackTrace();
}
}
@Override
public void channelRead(ChannelHandlerContext ctx, Object msg)
throws Exception {
ByteBuf buf = (ByteBuf) msg;
try {
buf.readBytes(fos, buf.readableBytes());
} catch (Exception e) {
e.printStackTrace();
} finally {
buf.release(); // Should always be done, even if writing to the file fails
}
}
}
Now suppose I need to send ten thousand pictures, but every picture is small, around 1 KB. I have to close and then re-establish the channel frequently, which wastes a lot of resources. How can I close only the FileOutputStream while keeping the channel alive?
This is just an idea, and I have not tested it, but rather than sending each file in its own connection, you could start a stream where you send:
- The number of files to be sent (once)
- The file info and content (for each file):
  - The file size
  - The file name size
  - The file name
  - The file content (bytes)
The client would look something like this:
public void sendFiles(Channel channel, File...files) {
ByteBufAllocator allocator = PooledByteBufAllocator.DEFAULT;
int fileCount = files.length;
// Send the file count
channel.write(allocator.buffer(4).writeInt(fileCount));
// For each file
Arrays.stream(files).forEach(f -> {
try {
// Get the file content
byte[] content = Files.readAllBytes(f.toPath());
byte[] fileName = f.getAbsolutePath().getBytes(UTF8);
// Write the content size, filename and the content
channel.write(allocator.buffer(8 + fileName.length + content.length) // 4 bytes each for the two length ints
.writeInt(content.length)
.writeInt(fileName.length)
.writeBytes(fileName)
.writeBytes(content)
);
} catch (IOException e) {
throw new RuntimeException(e); // perhaps do something better here.
}
});
// Flush the channel
channel.flush();
}
On the server side, you would need a slightly more sophisticated channel handler. I was thinking of a replaying decoder. (Example here)
In that example, the decoder reads all the files and then forwards them to the next handler, which receives a list of Upload instances, but you could send each upload up the pipeline as soon as it is received so you don't allocate as much memory. Either way, the intent is to send all your files in one stream rather than having to connect/disconnect for each file.
I am trying to download an 800 MB file from Google Drive in a streamed fashion: I fetch bytes from Google Drive, write them to my response output stream, and flush it. Here is the code for it:
public void downloadFileAsStream(String accessToken, String fileId
, HttpServletResponse response) throws Exception {
Credential credential = new GoogleCredential().setAccessToken(accessToken);
Drive service = new Drive.Builder(HTTP_TRANSPORT, JSON_FACTORY, credential).build();
File file = null;
try {
file = service.files().get(fileId).setFields("name, size").execute();
} catch (Exception ex) {
logger.error("Exception occurred while getting file from google drive", ex);
throw ex;
}
long fileSize = file.getSize();
OutputStream ros = response.getOutputStream();
for (long i = 0; i<= fileSize; i=i+10000000) {
byte[] fileRangeBytes = getBytes(service, accessToken, fileId, i, directDownloadThreshold);
ros.write(fileRangeBytes);
ros.flush();
}
ros.close();
}
private byte[] getBytes(Drive drive, String accessToken, String fileId, long position, long byteCount) throws Exception {
byte[] receivedByteArray = null;
String downloadUrl = "https://www.googleapis.com/drive/v3/files/" + fileId + "?alt=media&access_token="
+ accessToken;
try {
com.google.api.client.http.HttpRequest httpRequestGet = drive.getRequestFactory().buildGetRequest(new GenericUrl(downloadUrl));
httpRequestGet.getHeaders().setRange("bytes=" + position + "-" + (position + byteCount - 1));
com.google.api.client.http.HttpResponse response = httpRequestGet.execute();
InputStream is = response.getContent();
receivedByteArray = IOUtils.toByteArray(is);
response.disconnect();
} catch (IOException e) {
e.printStackTrace();
throw e;
}
return receivedByteArray;
}
The problem here is that the file does not start downloading in the browser immediately, in chunks.
Rather, my application just keeps waiting until the whole file is written to the response's output stream.
So why is the flushing to the browser not happening in my case, even though I have responseOutputStream.flush() inside the for loop, as in this question: Java file download hangs
I am creating a RESTful web service that accepts any file and saves it to the filesystem. I am using Dropwizard to implement the service and Postman/RESTClient to send the request with data. I am not creating a multipart (form-data) request.
Everything is working fine except that the saved file has its first character missing. Here is my code for handling the request and saving the file to the filesystem:
Input Request:
http://localhost:8080/test/request/Sample.txt
Sample.txt
Test Content
Rest Controller
@PUT
@Consumes(value = MediaType.WILDCARD)
@Path("/test/request/{fileName}")
public Response authenticateDevice(@PathParam("fileName") String fileName, @Context HttpServletRequest request) throws IOException {
.......
InputStream inputStream = request.getInputStream();
writeFile(inputStream, fileName);
......
}
private void writeFile(InputStream inputStream, String fileName) {
OutputStream os = null;
try {
File file = new File(this.directory);
file.mkdirs();
if (file.exists()) {
os = new FileOutputStream(this.directory + fileName);
logger.info("File Written Successfully.");
} else {
logger.info("Problem Creating directory. File can not be saved!");
}
byte[] buffer = new byte[inputStream.available()];
int n;
while ((n = inputStream.read(buffer)) != -1) {
os.write(buffer, 0, n);
}
} catch (Exception e) {
logger.error("Error in writing to File::" + e);
} finally {
try {
os.close();
inputStream.close();
} catch (IOException e) {
logger.error("Error in closing input/output stream::" + e);
}
}
}
In the output, the file is saved but the first character of the content is missing.
Output:
Sample.txt:
est Content
In the above output file, the character T is missing, and this happens for all file formats.
I don't know what I am missing here.
Please help me out on this.
Thank You.