How to save a PDF to AWS S3 - Java

I am using PDFBox to merge two PDF files in my code, and then I want to store the resulting merged file in an AWS S3 bucket.
I was trying to store the PDF file to S3 directly, without saving it locally on my system, but I am not able to figure out a way to do it.
My code to merge the two PDFs:
PDFMergerUtility pdfMergerUtility = new PDFMergerUtility();
String finalFileName = "Merged.pdf";
pdfMergerUtility.setDestinationFileName(finalFileName);
pdfMergerUtility.addSource(FileOne);
pdfMergerUtility.addSource(FileTwo);
pdfMergerUtility.mergeDocuments(MemoryUsageSetting.setupMainMemoryOnly());
// To upload to S3
String fileNameIWantInS3 = "myfile.pdf";
s3.putObject(BucketName, fileNameIWantInS3, ??); // Stuck here
I don't want to create a file on my server; instead I want to put it on S3 directly. How can I modify this code to upload Merged.pdf to the S3 bucket?
The code above is just the part where I am stuck. FileOne and FileTwo were created using File.createTempFile.
The whole idea is to merge the two files and put the final file on S3 without making a physical copy of it on the server. Please help.

To upload the file you need to pass a byte array or an input stream; the following uses a byte array:
public void saveFileToS3(String yourBucketName, String pathAws, String folderName,
                         File fileOne, File fileTwo) throws IOException {
    PDFMergerUtility pdfMergerUtility = new PDFMergerUtility();
    pdfMergerUtility.addSource(fileOne);
    pdfMergerUtility.addSource(fileTwo);
    // Merge into an in-memory stream instead of a file; the destination
    // stream must be set before mergeDocuments() is called.
    ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
    pdfMergerUtility.setDestinationStream(byteArrayOutputStream);
    pdfMergerUtility.mergeDocuments(MemoryUsageSetting.setupMainMemoryOnly());
    // AwsUtil.s3Authentication() is assumed to return an SDK v2 S3AsyncClient,
    // which is what the builder-style request and AsyncRequestBody below require
    S3AsyncClient s3Client = AwsUtil.s3Authentication();
    PutObjectRequest objectRequest = PutObjectRequest.builder()
            .bucket(yourBucketName)
            .key(folderName + "/" + pathAws) // pathAws: path on S3 where the file will be saved
            .build();
    // Convert the merged document into a byte array
    byte[] pdfBytes = byteArrayOutputStream.toByteArray();
    CompletableFuture<PutObjectResponse> future =
            s3Client.putObject(objectRequest, AsyncRequestBody.fromBytes(pdfBytes));
    future.whenComplete((resp, err) -> {
        try {
            if (resp != null) {
                System.out.println("Object uploaded. Details: " + resp);
            } else {
                err.printStackTrace();
            }
        } finally {
            s3Client.close();
        }
    });
    future.join();
}
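If you are on the AWS SDK for Java v1, which matches the s3.putObject(BucketName, fileNameIWantInS3, ??) call in the question, a minimal sketch of the same in-memory idea looks like this (the bucket and key values are placeholders, and the merged bytes are assumed to already be in a ByteArrayOutputStream as above):
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.ObjectMetadata;

public class MergedPdfUploader {
    // Uploads in-memory PDF bytes with the v1 client; nothing is written to the server's disk.
    public static void uploadMergedPdf(ByteArrayOutputStream mergedPdf,
                                       String bucketName, String key) {
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
        byte[] bytes = mergedPdf.toByteArray();
        ObjectMetadata metadata = new ObjectMetadata();
        metadata.setContentLength(bytes.length); // required when uploading from a stream
        metadata.setContentType("application/pdf");
        s3.putObject(bucketName, key, new ByteArrayInputStream(bytes), metadata);
    }
}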

Related

Uploaded files duplicated in my project and AWS S3 bucket

I created a Java program that saves files to Amazon S3 storage. It works OK, but it saves files not only in the S3 bucket but also in my project directory.
Here is my code that saves files to S3. I suppose the reason it also saves to the project directory is the creation of a File instance with a relative path: File file = new File(timestamp + ".jpg"). But how can I avoid that and still set the needed file name, without saving the file to the project directory?
public void saveFileToStorage(String url, Long timestamp, Integer deviceId) {
    S3Repository repository = new S3Repository(bucketName);
    File file = new File(timestamp + ".jpg");
    try {
        URL link = new URL(url);
        Thread.sleep(1500); // wait until URL is ready for download
        FileUtils.copyURLToFile(link, file);
        repository.uploadFile(timestamp.toString(), file, deviceId.toString() + "/");
    } catch (IOException | InterruptedException e) {
        log.error(e.getMessage() + " - check thread sleep time!");
        throw new RuntimeException(e);
    }
}
Here is my upload method from repository:
public void uploadFile(String keyName, File file, String folder) {
    // first put creates an empty "folder" marker object, then the file is uploaded under it
    ObjectMetadata metadata = new ObjectMetadata();
    metadata.setContentLength(0);
    s3client.putObject(bucketName, folder, new ByteArrayInputStream(new byte[0]), metadata);
    s3client.putObject(new PutObjectRequest(bucketName, folder + keyName, file));
}
It's quite common to do something similar to what you've done.
I personally like the PutObjectRequest builder.
S3Client client = S3Client.builder().build();
PutObjectRequest request = PutObjectRequest.builder()
        .bucket("bucketName").key("fileName").build();
client.putObject(request, RequestBody.fromFile(new File("filePath")));
To address your problem, what about using RequestBody.fromByteBuffer() instead of RequestBody.fromFile()?
Here you can find an example:
https://stackabuse.com/aws-s3-with-java-uploading-files-creating-and-deleting-s3-buckets/
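As a sketch of that in-memory approach with the v2 API (the helper shape is illustrative, not from the original code; RequestBody.fromBytes plays the same role as the fromByteBuffer suggestion above): it buffers the image from the URL and uploads the bytes directly, so no local file is created.
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import software.amazon.awssdk.core.sync.RequestBody;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;

public class UrlToS3 {
    // Copies the URL's content straight into S3 without writing a local file.
    public static void uploadFromUrl(S3Client client, String url,
                                     String bucket, String key) throws IOException {
        byte[] bytes;
        try (InputStream in = new URL(url).openStream()) {
            bytes = in.readAllBytes(); // Java 9+; buffers the image in memory
        }
        PutObjectRequest request = PutObjectRequest.builder()
                .bucket(bucket)
                .key(key)
                .build();
        client.putObject(request, RequestBody.fromBytes(bytes));
    }
}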

Get file from GCS without downloading it locally

I have a simple Spring Boot microservice that takes care of uploading, retrieving and deleting images to/from Google Cloud Storage. I have the following code for the get request in my service:
public StorageObject getImage(String fileName) throws IOException {
    StorageObject object = storage.objects().get(bucketName, fileName).execute();
    File file = new File("./" + fileName);
    FileOutputStream os = new FileOutputStream(file);
    storage.getRequestFactory()
            .buildGetRequest(new GenericUrl(object.getMediaLink()))
            .execute()
            .download(os);
    object.set("file", file);
    return object;
}
And this is my controller part:
@GetMapping("/get/image/{id}")
public ResponseEntity<byte[]> getImage(@PathVariable("id") Long id) {
    try {
        String fileName = imageService.findImageById(id);
        StorageObject object = gcsService.getImage(fileName);
        byte[] res = Files.toByteArray((File) object.get("file"));
        return ResponseEntity.ok()
                .contentType(MediaType.IMAGE_JPEG)
                .body(res);
    } catch (IOException e) {
        e.printStackTrace();
        throw new RuntimeException("No such file or directory");
    }
}
It all works fine in terms of getting the image in the response, but my problem is that the images also get downloaded to the root directory of the project. Many images are going to be uploaded through this service, so this is an issue. I only want to return the images in the response (as a byte array), without having them written to disk. I tried playing with the code but couldn't get it to work the way I want.
I'd suggest streaming the download instead, skipping the FileOutputStream step:
public static void streamObjectDownload(
        String projectId, String bucketName, String objectName, String targetFile) {
    Storage storage = StorageOptions.newBuilder().setProjectId(projectId).build().getService();
    try (ReadChannel reader = storage.reader(BlobId.of(bucketName, objectName));
         FileChannel targetFileChannel =
                 FileChannel.open(Paths.get(targetFile), StandardOpenOption.WRITE)) {
        ByteStreams.copy(reader, targetFileChannel);
        System.out.println(
                "Downloaded object " + objectName
                + " from bucket " + bucketName
                + " to " + targetFile
                + " using a ReadChannel.");
    } catch (IOException e) {
        e.printStackTrace();
    }
}
One can e.g. obtain a FileChannel from a RandomAccessFile:
RandomAccessFile file = new RandomAccessFile(targetFile, "rw");
FileChannel channel = file.getChannel();
While the Spring framework similarly has a GoogleStorageResource:
public OutputStream getOutputStream() throws IOException
Returns the output stream for a Google Cloud Storage file.
Then convert the stream's contents to byte[] (this may be binary or ASCII data); if os is a ByteArrayOutputStream:
byte[] bytes = os.toByteArray();
Would it work for you to create Signed URLs in Cloud Storage to display your images? These URLs give access to storage bucket files for a limited time and then expire, so you would not have to store temporary copies of the image locally, as is suggested in this post.
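If all you need is the bytes for the HTTP response, a minimal sketch that never touches the local disk (assuming the google-cloud-storage client library rather than the lower-level storage.objects() API used in the question) could be:
import com.google.cloud.storage.BlobId;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;

public class GcsImageReader {
    private final Storage storage = StorageOptions.getDefaultInstance().getService();

    // Reads the whole object into memory and returns it; nothing is written to disk.
    public byte[] getImageBytes(String bucketName, String fileName) {
        return storage.readAllBytes(BlobId.of(bucketName, fileName));
    }
}
The controller can then put these bytes directly in the ResponseEntity body, with no File in between.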

How to extract tar files from an Amazon S3 bucket to another S3 bucket in Java

I have tar files in an S3 bucket and I'm trying to untar them into another S3 bucket.
So far I get all the files in the destination bucket, but it seems that putObject leaves the files corrupted or empty. How can I read each whole entry and write the whole buffer in putObject?
Here is the code I am using:
TarArchiveInputStream tarInputStream =
        new TarArchiveInputStream(new BufferedInputStream(objectData));
TarArchiveEntry currentEntry;
while ((currentEntry = tarInputStream.getNextTarEntry()) != null) {
    if (!currentEntry.isDirectory()) {
        byte[] objectBytes = new byte[(int) currentEntry.getSize()];
        // read() may return fewer bytes than requested, which is the likely
        // cause of the corruption; loop until the entry is fully read
        int offset = 0;
        while (offset < objectBytes.length) {
            int read = tarInputStream.read(objectBytes, offset, objectBytes.length - offset);
            if (read == -1) break;
            offset += read;
        }
        String entryName = currentEntry.getName();
        String fileN = entryName.substring(entryName.lastIndexOf("/") + 1);
        ObjectMetadata metadata = new ObjectMetadata();
        metadata.setContentLength(objectBytes.length);
        metadata.setContentType("application/octet-stream");
        s3Client.putObject(destbucket, packagePath + "untar_frames/" + fileN,
                new ByteArrayInputStream(objectBytes), metadata);
    }
}
Try using the V2 version of the Amazon S3 Java API. I have not seen an issue with putObject when you properly populate the byte[] with valid data. Also, make sure that you properly set up the PutObjectRequest object. See this code example:
https://github.com/scmacdon/aws-doc-sdk-examples/blob/master/javav2/example_code/s3/src/main/java/com/example/s3/PutObject.java
If you are not familiar with AWS SDK for Java V2, please refer to this Quick Start.
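For illustration, here is a sketch of the same untar-and-upload loop with the v2 API (srcBucket, destBucket, and the key names are placeholders; it relies on the fact that an archive stream's read() reports end-of-stream at each entry boundary, so readAllBytes() returns exactly one entry):
import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;
import software.amazon.awssdk.core.sync.RequestBody;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.GetObjectRequest;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;

public class UntarToS3 {
    public static void untar(S3Client s3, String srcBucket, String tarKey,
                             String destBucket, String destPrefix) throws Exception {
        GetObjectRequest get = GetObjectRequest.builder()
                .bucket(srcBucket).key(tarKey).build();
        try (TarArchiveInputStream tar = new TarArchiveInputStream(s3.getObject(get))) {
            TarArchiveEntry entry;
            while ((entry = tar.getNextTarEntry()) != null) {
                if (entry.isDirectory()) continue;
                byte[] bytes = tar.readAllBytes(); // reads exactly the current entry (Java 9+)
                String name = entry.getName();
                String fileName = name.substring(name.lastIndexOf('/') + 1);
                PutObjectRequest put = PutObjectRequest.builder()
                        .bucket(destBucket)
                        .key(destPrefix + fileName)
                        .build();
                s3.putObject(put, RequestBody.fromBytes(bytes));
            }
        }
    }
}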

How to get all PDF files from a Google Cloud Storage bucket in Java

From a bucket folder I can get a single PDF file from Google Storage. How can I download all the files from there at once?
E.g. for a single file:
public InputStream getBlobBytesArray() {
    String bucketName = "XYZ";
    String filename = "abc.pdf";
    Blob blob = getStorageInstance().get(BlobId.of(bucketName, filename));
    return new ByteArrayInputStream(getStorageInstance().readAllBytes(blob.getBlobId()));
}
Inside the bucket I have created one folder where 5 PDFs are stored.
How do I fetch them all?
public MultipartFile getMultiFileFromCloud() {
    // TODO Auto-generated method stub
    String bucketName = "XYZ";
    Blob fileBlob = getStorageInstance().get(BlobId.of(bucketName, filepath));
    MultipartFile multipartFile = new CommonsMultipartFile(
            new ByteArrayInputStream(fileBlob.getContent(BlobSourceOption.generationMatch())));
    return null;
}
I have tried this, but nothing is working. Thanks in advance.
Google Storage provides a way to list all the objects inside a folder:
public Page<Blob> getBlobs() {
    String bucketName = "XYZ";
    Bucket bucket = storage.get(bucketName);
    Page<Blob> blobs = bucket.list(Storage.BlobListOption.prefix("folderPath/1"));
    return blobs;
}
Now this is working fine for me.
You can search for files only by the prefix of their full name (path included). If you are looking for a suffix, or for content-type metadata, you need to list all the files and select the ones you want in your code (for example, by the *.pdf extension in the name).
Also, you can't download them with one command, only one by one, as in the sketch below.
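A minimal sketch of that approach (the prefix, bucket name, and the in-memory map are illustrative assumptions):
import java.util.HashMap;
import java.util.Map;
import com.google.api.gax.paging.Page;
import com.google.cloud.storage.Blob;
import com.google.cloud.storage.Bucket;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;

public class PdfFolderDownloader {
    // Lists objects under a prefix, keeps only *.pdf names, and downloads each one.
    public static Map<String, byte[]> downloadPdfs(String bucketName, String prefix) {
        Storage storage = StorageOptions.getDefaultInstance().getService();
        Bucket bucket = storage.get(bucketName);
        Page<Blob> blobs = bucket.list(Storage.BlobListOption.prefix(prefix));
        Map<String, byte[]> pdfs = new HashMap<>();
        for (Blob blob : blobs.iterateAll()) {
            if (blob.getName().endsWith(".pdf")) {
                pdfs.put(blob.getName(), blob.getContent()); // one download per file
            }
        }
        return pdfs;
    }
}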

How to store multiple uploaded PDF files in a specific location in Java?

I want to store an uploaded file in a specific location in Java. If I upload a.pdf, then I want it stored at "/home/rahul/doc/upload/". I went through some questions and answers on Stack Overflow, but I am not satisfied with the solutions.
I am working with Play Framework 2.1.2. I am not working with servlets.
The upload works, but the file is stored in the temp directory; I want the file stored in a folder with its real name, like a.pdf, not as a temp file.
public static Result upload() throws IOException {
    MultipartFormData body = request().body().asMultipartFormData();
    FilePart filePart1 = body.getFile("filePart1");
    File newFile1 = new File("path in computer");
    File file1 = filePart1.getFile();
    InputStream isFile1 = new FileInputStream(file1);
    byte[] byteFile1 = IOUtils.toByteArray(isFile1);
    FileUtils.writeByteArrayToFile(newFile1, byteFile1);
    isFile1.close();
    return ok("File uploaded");
}
But I am not satisfied with this solution, and I am uploading multiple doc files.
For example, if I upload one doc ab.docx, then after the upload it is stored in the temp directory, at a location like this: /tmp/multipartBody5886394566842144137asTemporaryFile
But I want this: /upload/ab.docx
Please suggest a solution to fix this.
Everything's correct; as a last step you need to renameTo the temporary file into your upload folder. You don't need to play around with streams; it's as simple as:
public static Result upload() {
    Http.MultipartFormData body = request().body().asMultipartFormData();
    FilePart upload = body.getFile("picture");
    if (upload != null) {
        String targetPath = "/your/target/upload-dir/" + upload.getFilename();
        upload.getFile().renameTo(new File(targetPath));
        return ok("File saved in " + targetPath);
    } else {
        return badRequest("Something Wrong");
    }
}
BTW, you should check whether targetPath already exists, to prevent errors and/or overwrites. A typical approach is incrementing the file name if a file with the same name already exists; for example, sending a.pdf three times should save the files as a.pdf, a_01.pdf, a_02.pdf, etc. A sketch of that check follows.
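A minimal sketch of that incrementing scheme (the helper name and the two-digit suffix format are assumptions, not from the answer; the result could be used as the target of the renameTo above):
import java.io.File;

public class UniqueNames {
    // Returns dir/name as-is if free, otherwise dir/name_01.ext, dir/name_02.ext, ...
    public static File uniqueTarget(File dir, String fileName) {
        File candidate = new File(dir, fileName);
        if (!candidate.exists()) {
            return candidate;
        }
        int dot = fileName.lastIndexOf('.');
        String base = (dot == -1) ? fileName : fileName.substring(0, dot);
        String ext = (dot == -1) ? "" : fileName.substring(dot); // includes the dot
        int counter = 1;
        do {
            candidate = new File(dir, String.format("%s_%02d%s", base, counter++, ext));
        } while (candidate.exists());
        return candidate;
    }
}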
I just completed it. My solution is working fine.
My solution for uploading multiple files is:
public static Result up() throws IOException {
    MultipartFormData body = request().body().asMultipartFormData();
    List<FilePart> resourceFiles = body.getFiles();
    InputStream input;
    OutputStream output;
    File part1;
    String prefix, suffix;
    for (FilePart picture : resourceFiles) {
        part1 = picture.getFile();
        input = new FileInputStream(part1);
        prefix = FilenameUtils.getBaseName(picture.getFilename());
        suffix = FilenameUtils.getExtension(picture.getFilename());
        part1 = new File("/home/rahul/Documents/upload", prefix + "." + suffix);
        part1.createNewFile();
        output = new FileOutputStream(part1);
        IOUtils.copy(input, output);
        // close the streams so file handles are not leaked
        output.close();
        input.close();
        Logger.info("Uploaded file successfully saved in " + part1.getAbsolutePath());
    }
    return ok("Files uploaded");
}
