AWS S3 Java SDK not copying file to folder - java

Java 8 here. I have an S3 bucket (myapp-bucket) with the following folder substructure:
/artwork
    /upload
    /staging
I have software that uploads image files to the /upload folder, and another app that I'm writing is supposed to process each image and copy it over to the /staging folder. The code I have to perform this S3 copy is as follows:
// The client is constructed via:
AmazonS3 amazonS3 = AmazonS3ClientBuilder.standard()
        .withRegion(Regions.US_EAST_1)
        .build();

CopyObjectRequest copyObjectRequest = new CopyObjectRequest(
        bucketName,
        objectKey,
        destinationBucket,
        destinationKey
);
log.info(
        "copying object from s3://{}/{} to s3://{}/{}",
        bucketName,
        objectKey,
        destinationBucket,
        destinationKey
);
amazonS3.copyObject(copyObjectRequest);
When this runs, I see the following log output:
[main] DEBUG com.me.myapp.ImageProcessor - copying object from s3://myapp-bucket/artwork/upload/rlj_amp244001_270.jpeg to s3://myapp-bucket/artwork/staging
But afterwards, instead of copying myapp-bucket/artwork/upload/&lt;theImageFile&gt; to the myapp-bucket/artwork/staging/ directory, it looks like a new object was created directly under /artwork named staging, likely with the binary contents of the image copied into it. Where am I going awry?! I just want to end up with myapp-bucket/artwork/staging/&lt;theImageFile&gt; -- thanks in advance!

S3 doesn't have folders/directories - they're just sugar over a flat key space. So your destinationKey needs to include the file name, e.g. artwork/staging/rlj_amp244001_270.jpeg
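For example, a minimal sketch of building that key from the variables in the question (the substring logic here is just illustrative):

// Keep only the file name from the source key, then re-prefix it for staging.
String fileName = objectKey.substring(objectKey.lastIndexOf('/') + 1);
String destinationKey = "artwork/staging/" + fileName;
amazonS3.copyObject(new CopyObjectRequest(bucketName, objectKey, destinationBucket, destinationKey));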

Related

How can I delete all the files one by one after downloading them from the Azure Blob folder using Java

I am downloading some files from Azure blob storage to my local directory for processing, and I need to delete each one as it is downloaded, so that if my process fails for some reason, the files that have not yet been downloaded are still there. I am able to download the files locally, but when I try to delete them from the folder, the entire folder gets deleted. My folder structure is:
image-source (container name)
    HDF (folder)
        a1.tif
        a2.tif
        a3.tif
    CFO (folder)
        b1.tif
        b2.tif
    JPO (folder)
        h1.tif
        h2.tif
image-source is the container name, under which I have 3 folders (HDF, CFO, JPO), each containing some image files. I have to delete each image file as it is downloaded, but in my code the entire folder is also deleted once all the files are gone.
When I get the files from the blob folder, the names come in the format HDF/a1.tif, HDF/a2.tif, and so on.
I am posting the code I am using for downloading the image files and also what I am doing to delete them:
try {
    CloudStorageAccount storageAccount = CloudStorageAccount.parse(storageConnect);
    CloudBlobClient blobClient = storageAccount.createCloudBlobClient();
    CloudBlobContainer myCloudBlobContainer = blobClient.getContainerReference(Constant.Container_Name);
    Iterable<ListBlobItem> blobs = myCloudBlobContainer.listBlobs();
    for (ListBlobItem blob : blobs) {
        if (blob instanceof CloudBlobDirectory) {
            CloudBlobDirectory directory = (CloudBlobDirectory) blob;
            Iterable<ListBlobItem> fileBlobs = directory.listBlobs();
            String actualFileName = "";
            String actualDir = "";
            CloudBlob cloudBlob = null;
            for (ListBlobItem fileBlob : fileBlobs) {
                if (fileBlob instanceof CloudBlob) {
                    cloudBlob = (CloudBlob) fileBlob;
                    if (cloudBlob.getName().contains(".tif")) {
                        // Blob names include the virtual folder, e.g. "HDF/a1.tif"
                        log.info("File name with directory: {}", cloudBlob.getName());
                        actualFileName = cloudBlob.getName().split("/")[1];
                        actualDir = cloudBlob.getName().split("/")[0];
                        Files.createDirectories(Paths.get(actualDir));
                        cloudBlob.download(new FileOutputStream(actualDir + "\\" + actualFileName));
                        log.info("Downloaded file {} from directory {}", actualFileName, actualDir);
                        deleteFileAfterDownload(cloudBlob);
                    }
                }
            }
        }
    }
} catch (Exception e) {
    log.error("Error while downloading/deleting blobs", e);
}

private void deleteFileAfterDownload(CloudBlob cloudBlob) throws StorageException {
    cloudBlob.deleteIfExists();
}
After I execute the code, the entire folder along with the files are removed, but I don't want that. I want to delete files one by one and retain the folder as it is.
Unfortunately, this is not possible with Azure Blob Storage, because the folders there are not real folders; they are virtual. Taking your data as the example, the blob's name is actually HDF/a1.tif, where HDF is the virtual folder. So once you delete all blobs in that virtual folder, the virtual folder is gone too.
If you want a proper folder hierarchy, take a look at Azure Data Lake Storage (Gen2), which is built on top of Azure Blob Storage, or at Azure File Storage. Both of these services have first-class support for folders.
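If you only need the virtual folder to survive while its real files come and go, one common workaround (my suggestion, not part of the original answer) is to park a zero-length placeholder blob inside it; a minimal sketch reusing myCloudBlobContainer from the question, with ".placeholder" as an arbitrary name:

// A zero-length blob keeps the virtual folder name present in listings
// even after every real .tif file has been deleted.
CloudBlockBlob placeholder = myCloudBlobContainer.getBlockBlobReference("HDF/.placeholder");
placeholder.uploadFromByteArray(new byte[0], 0, 0);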

Create multiple empty directories in Amazon S3 using Java

I am new to S3 and I am trying to create multiple directories in Amazon S3 using Java with only one call to S3.
I could only come up with this:
ObjectMetadata metadata = new ObjectMetadata();
metadata.setContentLength(0);
InputStream emptyContent = new ByteArrayInputStream(new byte[0]);
PutObjectRequest putObjectRequest =
        new PutObjectRequest(bucket, "test/tryAgain/", emptyContent, metadata);
s3.putObject(putObjectRequest);
But the problem with this is that, to upload 10 folders (when the key ends with "/", the console shows the object as a folder), I have to make 10 calls to S3.
I want to create all the folders at once, the way we can batch-delete with DeleteObjectsRequest.
Can anyone please suggest how to solve this?
Can you be a bit more specific as to what you're trying to do (or avoid doing)?
If you're primarily concerned with the cost per PUT, I don't think there is a way to batch 'upload' a directory with each file being a separate key and avoid that cost. Each PUT (even in a batch process) will cost you the price per PUT.
If you're simply trying to find a way to efficiently and recursively upload a folder, check out the uploadDirectory() method of TransferManager.
http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/transfer/TransferManager.html#uploadDirectory-java.lang.String-java.lang.String-java.io.File-boolean-
public MultipleFileUpload uploadDirectory(String bucketName,
                                          String virtualDirectoryKeyPrefix,
                                          File directory,
                                          boolean includeSubdirectories)
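A minimal usage sketch (bucket name, key prefix, and local path are placeholders; the classes live in com.amazonaws.services.s3.transfer, and exception handling is omitted):

// TransferManager walks the local directory and issues the individual
// PUT requests for you - convenient, but it does not reduce the per-PUT cost.
TransferManager tm = TransferManagerBuilder.standard().build();
MultipleFileUpload upload = tm.uploadDirectory("my-bucket", "test/tryAgain", new File("/local/dir"), true);
upload.waitForCompletion(); // blocks until every file has finished uploading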

How to download files from Amazon S3?

I have a folder named output inside a bucket named BucketA. I have a list of files in the output folder. How do I download them to my local machine using the AWS Java SDK?
Below is my code:
AmazonS3Client s3Client = new AmazonS3Client(credentials);
File localFile = new File("/home/abc/Desktop/AmazonS3/");
s3Client.getObject(new GetObjectRequest("bucketA", "/bucketA/output/"), localFile);
And I got the error:
AmazonS3Exception: The specified key does not exist.
Keep in mind that S3 is not a filesystem; it is an object store. There's a huge difference between the two, one being that directory-style activities simply won't work.
Suppose you have an S3 bucket with two objects in it:
/path/to/file1.txt
/path/to/file2.txt
When working with these objects you can't simply refer to /path/to/ like you can when working with files in a filesystem directory. That's because /path/to/ is not a directory but just part of a key in a very large hash table. This is why the error message indicates an issue with a key. These are not filename paths but keys to objects within the object store.
In order to copy all the files in a location like /path/to/ you need to perform it in multiple steps. First, you need to get a listing of all the objects whose keys begin with /path/to, then you need to loop through each individual object and copy them one by one.
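A minimal sketch of those two steps with the v1 Java SDK (bucket and paths taken from the question; pagination and error handling omitted):

// Step 1: list every object whose key starts with the prefix.
ObjectListing listing = s3Client.listObjects("bucketA", "output/");
// Step 2: download each object to a local file named after the last key segment.
for (S3ObjectSummary summary : listing.getObjectSummaries()) {
    String key = summary.getKey();
    File localFile = new File("/home/abc/Desktop/AmazonS3/" + key.substring(key.lastIndexOf('/') + 1));
    s3Client.getObject(new GetObjectRequest("bucketA", key), localFile);
}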
Here is a similar question with an answer that shows how to download multiple files from S3 using Java.
I know this question was asked a long time ago, but this answer might still help someone.
To list the objects you want to download from S3, you might want to use something like this:
new ListObjectsV2Request().withBucketName("bucketName").withDelimiter("/").withPrefix("path/to/image/");
As mentioned in the S3 docs, the delimiter should be "/" and the prefix should be your "folder-like structure".
You can use the predefined TransferManager methods to upload and download a whole directory.
For download:
MultipleFileDownload xfer = xfer_mgr.downloadDirectory(
        bucketName, key, new File("C:\\Users\\miracle\\Desktop\\Downloads"));
For upload:
MultipleFileUpload xfer = xfer_mgr.uploadDirectory(bucketName, key, dir, true);
The error message means that the bucket (in this case "bucketA") does not contain a file with the name you specified (in this case "/bucketA/output/").
When you specify the key, do not include the bucket name in the key. S3 supports "folders" in the key, which are delimited with "/", so you probably do not want to try to use keys that end with "/".
If your bucket "bucketA" contains a file called "output", you probably want to say
new GetObjectRequest("bucketA", "output")
If this doesn't work, other things to check:
Do the credentials you are using have permission to read from the bucket?
Did you spell all the names correctly?
You might want to use listObjects("bucketA") to verify what the bucket actually contains (as seen with the credentials you are using).
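For example, a quick sanity check (a sketch using the same client as in the question):

// Print every key these credentials can see in the bucket; note the keys
// themselves never start with the bucket name.
for (S3ObjectSummary summary : s3Client.listObjects("bucketA").getObjectSummaries()) {
    System.out.println(summary.getKey());
}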

Converting MultipartFile to java.io.File without copying to local machine

I have a Java Spring MVC web application. From the client, through AngularJS, I am uploading a file and posting it to a controller as a web service.
In my controller, I am getting it as a MultipartFile and I can copy it to the local machine.
But I want to upload the file to an Amazon S3 bucket, so I have to convert it to java.io.File. Right now what I am doing is copying it to the local machine and then uploading it to S3 using jets3t.
Here is how I convert it in the controller:
MultipartHttpServletRequest mRequest = (MultipartHttpServletRequest) request;
Iterator<String> itr = mRequest.getFileNames();
while (itr.hasNext()) {
    MultipartFile mFile = mRequest.getFile(itr.next());
    String fileName = mFile.getOriginalFilename();
    fileLoc = "/home/mydocs/my-uploads/" + date + "_" + fileName; // date is the String form of the current date
Then I am using FileCopyUtils from the Spring Framework:
    File newFile = new File(fileLoc);
    // if the directory does not exist, create it
    if (!newFile.getParentFile().exists()) {
        newFile.getParentFile().mkdirs();
    }
    FileCopyUtils.copy(mFile.getBytes(), newFile);
}
So it creates a new file on the local machine, and that file is what I upload to S3:
S3Object fileObject = new S3Object(newFile);
s3Service.putObject("myBucket", fileObject);
This creates a file on my local system, which I don't want.
Without creating a file on the local system, how do I convert a MultipartFile to a java.io.File?
A MultipartFile, by default, is already saved on your server as a temp file by the time you receive it.
From that point you can do anything you want with it.
There is a method that moves that temp file to any destination you want:
http://docs.spring.io/spring/docs/3.0.x/api/org/springframework/web/multipart/MultipartFile.html#transferTo(java.io.File)
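A minimal sketch using the variables from the question's controller (exception handling omitted):

// transferTo() moves the already-saved temp file to the destination,
// so your code never copies the bytes a second time.
File newFile = new File(fileLoc);
mFile.transferTo(newFile);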
But MultipartFile is just an API; you can plug in any other MultipartResolver:
http://docs.spring.io/spring/docs/3.0.x/api/org/springframework/web/multipart/MultipartResolver.html
This API accepts an input stream, and you can do whatever you want with it. The default implementation (usually commons-multipart) saves it to the temp dir as a file.
But one problem remains: if the S3 API accepts a File as a parameter, there's nothing you can do about it - you need a real file. If you want to avoid creating files at all, you would have to write your own S3 integration.
The question is already more than a year old, so I'm not sure if the jets3t link provided by the OP had the following snippet at that time:
If your data isn't a File or String you can use any input stream as a data source, but you must manually set the Content-Length.
// Create an object containing a greeting string as input stream data.
String greeting = "Hello World!";
S3Object helloWorldObject = new S3Object("HelloWorld2.txt");
ByteArrayInputStream greetingIS = new ByteArrayInputStream(greeting.getBytes());
helloWorldObject.setDataInputStream(greetingIS);
helloWorldObject.setContentLength(greeting.getBytes(Constants.DEFAULT_ENCODING).length);
helloWorldObject.setContentType("text/plain");
s3Service.putObject(testBucket, helloWorldObject);
It turns out you don't have to create a local file first. As @Boris suggests, you can feed the S3Object with the data input stream, content type, and content length you get from MultipartFile.getInputStream(), MultipartFile.getContentType() and MultipartFile.getSize() respectively.
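Put together, a sketch of that approach (mFile is the MultipartFile from the controller above; exception handling omitted):

// Build the jets3t S3Object straight from the multipart upload - no temp file.
S3Object fileObject = new S3Object(mFile.getOriginalFilename());
fileObject.setDataInputStream(mFile.getInputStream());
fileObject.setContentType(mFile.getContentType());
fileObject.setContentLength(mFile.getSize());
s3Service.putObject("myBucket", fileObject);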
Instead of copying it to your local machine, you can construct the File from the original name like this:
File newFile = new File(multipartFile.getOriginalFilename());
This way, you don't have to create the file at a local destination first.
If you are trying to use it with an HttpEntity, check my answer here:
https://stackoverflow.com/a/68022695/7532946

Uploading files to S3 using the AmazonS3Client.java API

I am using AmazonS3Client.java to upload files to S3 from my application, via the putObject method:
val putObjectRequest = new PutObjectRequest(bucketName, key, inputStream, metadata)
val acl = CannedAccessControlList.Private
putObjectRequest.setCannedAcl(acl)
s3.putObject(putObjectRequest)
This works for buckets at the topmost level in my S3 account. Now, suppose I want to upload the file to a sub-bucket, for example bucketB, which is inside bucketA. How should I specify the bucket name for bucketB?
Thank you!
It is admittedly somewhat surprising, but there is no such thing as a "sub-bucket" in S3. All buckets are top-level. The structures inside buckets that you see in the S3 admin console or other UIs are called "folders", but even they don't really exist! You can't directly create or destroy folders, for instance, or set any attributes on them. Folders are purely a presentation-level convention for viewing the underlying flat set of objects in your bucket. That said, it's pretty easy to split your objects into (purely non-existent) folders. Just give them hierarchical names, with each level separated by a "/":
val putObjectRequest = new PutObjectRequest(bucketName, topFolderName + "/" + subFolderName + "/" + key, inputStream, metadata)
Try using putObjectRequest.setKey("folder")
