Java AmazonS3 putObject fails silently

Others have posted about this without receiving an answer, and now I'm experiencing the same issue. It has actually been going on for 9 months, but I'm only now noticing it.
This sequence does not throw an exception and the message at the end is logged:
AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
String bucket = "...";
String key = "...";
File f = new File("...");
PutObjectResult r = s3.putObject(bucket, key, f);
String etag = r.getETag();
LOGGER.info("file ... saved with etag = "+etag);
The file is not present in the bucket when I look.
This fails a few dozen times a day out of thousands of files posted. There are 25 active threads using this sequence of code. Is the aws-java-sdk thread-safe? Other ideas?
This is running on an ec2 instance in the amazon cloud.
Details:
aws-java-sdk-s3-1.11.693.jar
java: 1.8.0_201-b09
ubuntu: 4.4.0-1077-aws

I saw this issue when uploading a 27MB file using AmazonS3.putObject. The command returned a PutObjectResult but there was no S3 object in the expected location and the result metadata (result.getMetadata().getContentLength()) showed 0 bytes. I fixed this by using multi-part upload per this link.
private static final Logger LOGGER = LoggerFactory.getLogger(S3Handler.class);
private static final long MAX_SINGLE_PART_UPLOAD_BYTES = 5 * 1024 * 1024;

private final AmazonS3 amazonS3;

public S3Handler(AmazonS3 amazonS3) {
    this.amazonS3 = amazonS3;
}

public void putS3Object(String bucket, String objectKey, File file) {
    if (file.length() <= MAX_SINGLE_PART_UPLOAD_BYTES) {
        putS3ObjectSinglePart(bucket, objectKey, file);
    } else {
        putS3ObjectMultiPart(bucket, objectKey, file);
    }
}

private void putS3ObjectSinglePart(String bucket, String objectKey, File file) {
    PutObjectRequest request = new PutObjectRequest(bucket, objectKey, file);
    PutObjectResult result = amazonS3.putObject(request);
    long bytesPushed = result.getMetadata().getContentLength();
    LOGGER.info("Pushed {} bytes to s3://{}/{}", bytesPushed, bucket, objectKey);
}

private void putS3ObjectMultiPart(String bucket, String objectKey, File file) {
    long contentLength = file.length();
    long partSize = MAX_SINGLE_PART_UPLOAD_BYTES;
    List<PartETag> partETags = new ArrayList<>();

    // Initiate the multipart upload.
    InitiateMultipartUploadRequest initRequest = new InitiateMultipartUploadRequest(bucket, objectKey);
    InitiateMultipartUploadResult initResponse = amazonS3.initiateMultipartUpload(initRequest);

    // Upload the file parts.
    long fileOffset = 0;
    for (int partNumber = 1; fileOffset < contentLength; ++partNumber) {
        // Because the last part could be less than 5 MB, adjust the part size as needed.
        partSize = Math.min(partSize, (contentLength - fileOffset));

        // Create the request to upload a part.
        UploadPartRequest uploadRequest = new UploadPartRequest()
                .withBucketName(bucket)
                .withKey(objectKey)
                .withUploadId(initResponse.getUploadId())
                .withPartNumber(partNumber)
                .withFileOffset(fileOffset)
                .withFile(file)
                .withPartSize(partSize);

        // Upload the part and add the response's ETag to our list.
        UploadPartResult uploadResult = amazonS3.uploadPart(uploadRequest);
        LOGGER.info("Uploading part {} of Object s3://{}/{}", partNumber, bucket, objectKey);
        partETags.add(uploadResult.getPartETag());

        fileOffset += partSize;
    }

    // Complete the multipart upload.
    CompleteMultipartUploadRequest compRequest = new CompleteMultipartUploadRequest(
            bucket, objectKey, initResponse.getUploadId(), partETags);
    amazonS3.completeMultipartUpload(compRequest);
}
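For what it's worth, a possible alternative on the v1 SDK is TransferManager, which decides between single-part and multipart uploads itself and surfaces failures when you wait for completion instead of returning silently. A minimal sketch (bucket, key, and file path are placeholders, not the poster's values):
// Sketch: TransferManager handles multipart splitting and retries internally.
AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
TransferManager tm = TransferManagerBuilder.standard().withS3Client(s3).build();
try {
    Upload upload = tm.upload("my-bucket", "my-key", new File("/path/to/file"));
    upload.waitForCompletion(); // throws if the transfer ultimately fails, rather than failing silently
} finally {
    tm.shutdownNow(false); // false = keep the underlying S3 client open for reuse
}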

Related

Upload of large files using azure-sdk-for-java with limited heap

We are developing a document microservice that needs to use Azure as storage for file content. Azure Block Blob seemed like a reasonable choice. The document service has its heap limited to 512 MB (-Xmx512m).
I was not able to get a streaming file upload with a limited heap to work using azure-storage-blob:12.10.0-beta.1 (also tested on 12.9.0).
The following approaches were attempted:
Copy-paste from the documentation using BlockBlobClient
BlockBlobClient blockBlobClient = blobContainerClient.getBlobClient("file").getBlockBlobClient();
File file = new File("file");
try (InputStream dataStream = new FileInputStream(file)) {
    blockBlobClient.upload(dataStream, file.length(), true /* overwrite file */);
}
Result: java.io.IOException: mark/reset not supported - the SDK tries to use mark/reset even though the file input stream reports this feature as not supported.
Adding a BufferedInputStream to mitigate the mark/reset issue (per advice):
BlockBlobClient blockBlobClient = blobContainerClient.getBlobClient("file").getBlockBlobClient();
File file = new File("file");
try (InputStream dataStream = new BufferedInputStream(new FileInputStream(file))) {
    blockBlobClient.upload(dataStream, file.length(), true /* overwrite file */);
}
Result: java.lang.OutOfMemoryError: Java heap space. I assume the SDK attempted to load all 1.17 GB of file content into memory.
Replacing BlockBlobClient with BlobClient and removing the heap size limitation (-Xmx512m):
BlobClient blobClient = blobContainerClient.getBlobClient("file");
File file = new File("file");
try (InputStream dataStream = new FileInputStream(file)) {
    blobClient.upload(dataStream, file.length(), true /* overwrite file */);
}
Result: 1.5 GB of heap memory used; all file content is loaded into memory, plus some buffering on the Reactor side.
Heap usage from VisualVM
Switch to streaming via BlobOutputStream:
long blockSize = DataSize.ofMegabytes(4L).toBytes();
BlockBlobClient blockBlobClient = blobContainerClient.getBlobClient("file").getBlockBlobClient();
// create / erase blob
blockBlobClient.commitBlockList(List.of(), true);
BlockBlobOutputStreamOptions options = new BlockBlobOutputStreamOptions()
        .setParallelTransferOptions(new ParallelTransferOptions()
                .setBlockSizeLong(blockSize)
                .setMaxConcurrency(1)
                .setMaxSingleUploadSizeLong(blockSize));
try (InputStream is = new FileInputStream("file")) {
    try (OutputStream os = blockBlobClient.getBlobOutputStream(options)) {
        IOUtils.copy(is, os); // uses 8KB buffer
    }
}
Result: the file is corrupted during upload. The Azure web portal shows 1.09 GB instead of the expected 1.17 GB. Manually downloading the file from the portal confirms that the content was corrupted during upload. The memory footprint decreased significantly, but the file corruption is a showstopper.
Problem: I cannot come up with a working upload/download solution with a small memory footprint.
Any help would be greatly appreciated!
Please try the code below to upload/download big files. I have tested it on my side with a .zip file of about 1.1 GB.
For uploading files:
public static void uploadFilesByChunk() {
    String connString = "<conn str>";
    String containerName = "<container name>";
    String blobName = "UploadOne.zip";
    String filePath = "D:/temp/" + blobName;

    BlobServiceClient client = new BlobServiceClientBuilder().connectionString(connString).buildClient();
    BlobClient blobClient = client.getBlobContainerClient(containerName).getBlobClient(blobName);

    long blockSize = 2 * 1024 * 1024; // 2 MB
    ParallelTransferOptions parallelTransferOptions = new ParallelTransferOptions()
            .setBlockSizeLong(blockSize).setMaxConcurrency(2)
            .setProgressReceiver(new ProgressReceiver() {
                @Override
                public void reportProgress(long bytesTransferred) {
                    System.out.println("uploaded:" + bytesTransferred);
                }
            });

    BlobHttpHeaders headers = new BlobHttpHeaders().setContentLanguage("en-US").setContentType("binary");

    blobClient.uploadFromFile(filePath, parallelTransferOptions, headers, null, AccessTier.HOT,
            new BlobRequestConditions(), Duration.ofMinutes(30));
}
Memory footprint:
For downloading files:
public static void downLoadFilesByChunk() {
    String connString = "<conn str>";
    String containerName = "<container name>";
    String blobName = "UploadOne.zip";
    String filePath = "D:/temp/" + "DownloadOne.zip";

    BlobServiceClient client = new BlobServiceClientBuilder().connectionString(connString).buildClient();
    BlobClient blobClient = client.getBlobContainerClient(containerName).getBlobClient(blobName);

    long blockSize = 2 * 1024 * 1024;
    com.azure.storage.common.ParallelTransferOptions parallelTransferOptions = new com.azure.storage.common.ParallelTransferOptions()
            .setBlockSizeLong(blockSize).setMaxConcurrency(2)
            .setProgressReceiver(new com.azure.storage.common.ProgressReceiver() {
                @Override
                public void reportProgress(long bytesTransferred) {
                    System.out.println("downloaded:" + bytesTransferred);
                }
            });

    BlobDownloadToFileOptions options = new BlobDownloadToFileOptions(filePath)
            .setParallelTransferOptions(parallelTransferOptions);

    blobClient.downloadToFileWithResponse(options, Duration.ofMinutes(30), null);
}
Memory footprint:
Result:
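If the download side also needs to run with a small heap, another option is to stream the blob instead of materializing it. A sketch, assuming blobClient is built the same way as above and the local path is a placeholder:
// Sketch: copy the blob to a local file through a streaming BlobInputStream.
try (InputStream in = blobClient.openInputStream();
     OutputStream out = new FileOutputStream("D:/temp/DownloadOne.zip")) {
    byte[] buffer = new byte[8192];
    int read;
    while ((read = in.read(buffer)) != -1) {
        out.write(buffer, 0, read);
    }
}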

Why am I getting an AmazonS3Exception?

My task is to encrypt a file that is uploaded to S3. The upload worked fine before the encryption, but now that I encrypt the file I get this exception:
The XML you provided was not well-formed or did not validate against our published schema
I added this to the existing code:
final AwsCrypto crypto = new AwsCrypto();
try (
        final FileInputStream in = new FileInputStream(encryptfile);
        final FileOutputStream out = new FileOutputStream(file);
        final CryptoOutputStream<?> encryptingStream = crypto.createEncryptingStream(crypt, out)) {
    IOUtils.copy(in, encryptingStream);
}
My thoughts: why does Amazon S3 expect an XML file? Why not a normal text document? Is there an option to change this, maybe with the bucket policy?
EDIT
This is the upload code; maybe there is an issue in it. I don't understand why it works without the encryption.
File uploaffile = encryptFile(file);
List<PartETag> partETags = new ArrayList<PartETag>();
String filename = String.valueOf(System.currentTimeMillis());

InitiateMultipartUploadRequest initRequest = new InitiateMultipartUploadRequest(awss3bucket, filename);
InitiateMultipartUploadResult initResponse = amazons3.initiateMultipartUpload(initRequest);

long partSize = 5 * 1024 * 1024;
long contentLength = uploaffile.length();
long filePosition = 0;
for (int i = 1; filePosition < contentLength; i++) {
    partSize = Math.min(partSize, (contentLength - filePosition));

    UploadPartRequest uploadRequest = new UploadPartRequest()
            .withBucketName(awss3bucket)
            .withKey(filename)
            .withUploadId(initResponse.getUploadId())
            .withPartNumber(i)
            .withFileOffset(filePosition)
            .withFile(uploaffile)
            .withPartSize(partSize);

    PartETag petag = new PartETag(amazons3.uploadPart(uploadRequest).getPartNumber(), amazons3.uploadPart(uploadRequest).getETag());
    partETags.add(petag);
    filePosition += partSize;
}

CompleteMultipartUploadRequest compRequest = new CompleteMultipartUploadRequest(awss3bucket, filename,
        initResponse.getUploadId(), partETags);
amazons3.completeMultipartUpload(compRequest);
Maybe next time I should stop copying random code from the Internet. The FileOutputStream is the stream you write to, not the other way around, so the encrypted file ended up empty. With a zero-length file the part-upload loop never runs, and completing a multipart upload with an empty part list is what produces the "XML was not well-formed" error. The fixed code:
CryptoInputStream<KmsMasterKey> encryptingStream = crypto.createEncryptingStream(crypt, in);
FileOutputStream out = null;
try {
    out = new FileOutputStream(encryptfile);
    IOUtils.copy(encryptingStream, out);
    encryptingStream.close();
    out.close();
} catch (IOException e) {
    e.printStackTrace();
}
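The same fix can be written with try-with-resources so both streams are closed even if the copy fails; a sketch that assumes the same crypto, crypt, in, and encryptfile variables as above:
// Sketch: identical logic, but the streams close themselves on any exit path.
try (CryptoInputStream<KmsMasterKey> encryptingStream = crypto.createEncryptingStream(crypt, in);
     FileOutputStream out = new FileOutputStream(encryptfile)) {
    IOUtils.copy(encryptingStream, out);
}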

AWS Lambda (serverless) file upload using Java

I'm trying to upload a file to an AWS S3 bucket using a serverless Lambda application written in Java.
I'm hitting the endpoint with the file using Postman's binary option (screenshot attached).
I'm receiving the binary content as a string in my endpoint, as follows (screenshot attached).
I'm trying to convert this binary string to a byte array and upload it to the S3 bucket.
I'm getting a success response, but when I download the file / image it doesn't look like an actual file.
Sample code:
@Override
public ServerlessOutput handleRequest(ServerlessInput serverlessInput, Context context) {
    ServerlessOutput output = new ServerlessOutput();
    String keyName = UUID.randomUUID().toString();
    String content = serverlessInput.getBody();
    byte[] encoded = this.toBinary(content).getBytes();

    ObjectMetadata metadata = new ObjectMetadata();
    metadata.setContentLength(encoded.length);
    metadata.setContentType(PNG_MIME);

    s3.putObject(new PutObjectRequest(
            ARTICLE_BUCKET_NAME,
            keyName,
            new ByteArrayInputStream(encoded),
            metadata)
    );
    output.setBody("Successfully inserted article ");
    return output;
}

private String toBinary(String data) {
    byte[] bytes = data.getBytes();
    StringBuilder binary = new StringBuilder();
    for (byte b : bytes) {
        int val = b;
        for (int i = 0; i < 8; i++) {
            binary.append((val & 128) == 0 ? 0 : 1);
            val <<= 1;
        }
        binary.append(' ');
    }
    return binary.toString();
}
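A likely cause of the "not an actual file" result is the toBinary step, which turns the payload into a text string of 0s and 1s instead of decoding it. When API Gateway is configured for binary media types, the body typically arrives Base64-encoded, so a sketch of the handler (reusing the question's ServerlessInput/ServerlessOutput, s3, ARTICLE_BUCKET_NAME, and PNG_MIME, and assuming the body really is Base64) would decode it and upload the raw bytes:
@Override
public ServerlessOutput handleRequest(ServerlessInput serverlessInput, Context context) {
    ServerlessOutput output = new ServerlessOutput();
    String keyName = UUID.randomUUID().toString();

    // Assumption: the body is the Base64-encoded image, not a "0101..." text rendering.
    byte[] decoded = java.util.Base64.getDecoder().decode(serverlessInput.getBody());

    ObjectMetadata metadata = new ObjectMetadata();
    metadata.setContentLength(decoded.length);
    metadata.setContentType(PNG_MIME);

    s3.putObject(new PutObjectRequest(ARTICLE_BUCKET_NAME, keyName,
            new ByteArrayInputStream(decoded), metadata));

    output.setBody("Successfully inserted article " + keyName);
    return output;
}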

After storing an image file in a Google Cloud Storage bucket, the file list shows every file with a size of "0 bytes"

Steps I've followed:
1) Store the images in Blobstore (this is necessary).
2) Fetch the images from Blobstore and save them in a GCS bucket.
If I store the image file directly in the GCS bucket it works fine, but I want to fetch it from Blobstore.
Map<String, List<BlobInfo>> blobsData = blobstoreService.getBlobInfos(req);
for (String key : blobsData.keySet()) {
    for (BlobInfo blob : blobsData.get(key)) {
        byte[] b = new byte[(int) blob.getSize()];
        BlobstoreInputStream in = new BlobstoreInputStream(blob.getBlobKey());
        in.read(b);

        GcsService gcsService = GcsServiceFactory.createGcsService();
        GcsFilename filename = new GcsFilename("casfilestorage", blob.getFilename());

        GSFileOptionsBuilder builder = new GSFileOptionsBuilder()
                .setAcl("public_read")
                .setBucket(BUCKET_NAME)
                .setKey(blob.getFilename())
                .setMimeType(blob.getContentType());
        AppEngineFile writableFile = fileService.createNewGSFile(builder.build());

        boolean lock = true;
        writeChannel = fileService.openWriteChannel(writableFile, lock);
        os = Channels.newOutputStream(writeChannel);

        UploadOptions uploadOptions = UploadOptions.Builder.withGoogleStorageBucketName("casgaestorage");
        //String uploadUrl = blobstoreService.createUploadUrl("/serve", uploadOptions);

        os.close();
        writeChannel.closeFinally();
        in.close();
    }
}
Refer to the code below; just pass your byte[] to the write method:
GcsService f_ObjGcsService = GcsServiceFactory.createGcsService();
GcsFilename f_ObjFilename = new GcsFilename("BUCKET NAME", "FileName");
GcsFileOptions f_ObjOptions = new GcsFileOptions.Builder()
        .mimeType("Content Type")
        .acl("public-read")
        .build();
GcsOutputChannel f_ObjWriteChannel = f_ObjGcsService.createOrReplace(f_ObjFilename, f_ObjOptions);
f_ObjWriteChannel.write(ByteBuffer.wrap(blobBytes)); // blobBytes = the byte[] you get from the blob
f_ObjWriteChannel.waitForOutstandingWrites();
f_ObjWriteChannel.close();
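Applied to the question's loop, the missing step is writing the bytes to the channel before closing it. Here is a sketch (reusing the question's blob, BUCKET_NAME, and gcsService names) that streams each blob from Blobstore to GCS in chunks rather than reading it fully into memory:
// Sketch: copy one Blobstore blob into GCS in 256 KB chunks instead of never writing.
GcsFileOptions options = new GcsFileOptions.Builder()
        .mimeType(blob.getContentType())
        .acl("public-read")
        .build();
GcsFilename gcsFile = new GcsFilename(BUCKET_NAME, blob.getFilename());
GcsOutputChannel channel = gcsService.createOrReplace(gcsFile, options);
try (BlobstoreInputStream in = new BlobstoreInputStream(blob.getBlobKey())) {
    byte[] buffer = new byte[256 * 1024]; // 256 KB chunks
    int read;
    while ((read = in.read(buffer)) != -1) {
        channel.write(ByteBuffer.wrap(buffer, 0, read)); // only the bytes actually read
    }
}
channel.close();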

1MB quota limit for a blobstore object in Google App Engine?

I'm using App Engine (version 1.4.3) and writing directly to the blobstore in order to save images.
When I try to store an image that is larger than 1 MB I get the following exception:
com.google.apphosting.api.ApiProxy$RequestTooLargeException: The request to API call datastore_v3.Put() was too large.
I thought the limit for each object was 2 GB.
Here is the Java code that stores the image:
private void putInBlobStore(final String mimeType, final byte[] data) throws IOException {
    final FileService fileService = FileServiceFactory.getFileService();
    final AppEngineFile file = fileService.createNewBlobFile(mimeType);
    final FileWriteChannel writeChannel = fileService.openWriteChannel(file, true);
    writeChannel.write(ByteBuffer.wrap(data));
    writeChannel.closeFinally();
}
Here is how I read and write large files:
public byte[] readImageData(BlobKey blobKey, long blobSize) {
    BlobstoreService blobStoreService = BlobstoreServiceFactory.getBlobstoreService();
    byte[] allTheBytes = new byte[0];
    long amountLeftToRead = blobSize;
    long startIndex = 0;
    while (amountLeftToRead > 0) {
        long amountToReadNow = Math.min(BlobstoreService.MAX_BLOB_FETCH_SIZE - 1, amountLeftToRead);
        byte[] chunkOfBytes = blobStoreService.fetchData(blobKey, startIndex, startIndex + amountToReadNow - 1);
        allTheBytes = ArrayUtils.addAll(allTheBytes, chunkOfBytes);
        amountLeftToRead -= amountToReadNow;
        startIndex += amountToReadNow;
    }
    return allTheBytes;
}

public BlobKey writeImageData(byte[] bytes) throws IOException {
    FileService fileService = FileServiceFactory.getFileService();
    AppEngineFile file = fileService.createNewBlobFile("image/jpeg");
    boolean lock = true;
    FileWriteChannel writeChannel = fileService.openWriteChannel(file, lock);
    writeChannel.write(ByteBuffer.wrap(bytes));
    writeChannel.closeFinally();
    return fileService.getBlobKey(file);
}
The maximum object size is 2 GB but each API call can only handle a maximum of 1 MB. At least for reading, but I assume it may be the same for writing. So you might try to split your writing of the object into 1 MB chunks and see if that helps.
As Brummo suggested above, if you split it into chunks < 1 MB it works. Here's some code.
public BlobKey putInBlobStoreString(String fileName, String contentType, byte[] filebytes) throws IOException {
    // Get a file service
    FileService fileService = FileServiceFactory.getFileService();
    AppEngineFile file = fileService.createNewBlobFile(contentType, fileName);

    // Open a channel to write to it
    boolean lock = true;
    FileWriteChannel writeChannel = fileService.openWriteChannel(file, lock);

    // Buffer the input and push it to the channel in 0.5 MB chunks
    BufferedInputStream in = new BufferedInputStream(new ByteArrayInputStream(filebytes));
    byte[] buffer = new byte[524288]; // 0.5 MB buffers
    int read;
    while ((read = in.read(buffer)) > 0) { // read() returns -1 at end of stream
        ByteBuffer bb = ByteBuffer.wrap(buffer, 0, read); // only wrap the bytes actually read
        writeChannel.write(bb);
    }
    in.close();
    writeChannel.closeFinally();
    return fileService.getBlobKey(file);
}
