Hi, I want to copy an existing object to the same path in AWS S3, and I am getting the following exception:
This copy request is illegal because it is trying to copy an object to itself without changing the object's metadata, storage class, website redirect location or encryption attributes
I am using Apache Camel AWS2 S3. How can I resolve this? After searching, I found there is a request header that can be used to replace the existing file, but it is not working:
// Multiple other attempts are also present; I am not sure which header will work
exchange.`in`.headers[AWS2S3Constants.METADATA] = mutableMapOf(
    "x-amz-metadata-directive" to "REPLACE",
    "x-amz-meta-directive" to "REPLACE",
    "metadata-directive" to "REPLACE",
    "MetadataDirective" to "REPLACE"
)
I have logged the request:
Sending Request: DefaultSdkHttpFullRequest(httpMethod=PUT, protocol=https, host=, port=443, encodedPath=, headers=[amz-sdk-invocation-id, User-Agent, x-amz-copy-source, x-amz-meta-directive, x-amz-meta-metadata-directive, x-amz-meta-MetadataDirective, x-amz-meta-x-amz-metadata-directive], queryParameters=[])
But it is not working. How can I copy an existing object to the same path without getting this error?
I ended up changing the filename by suffixing it with a timestamp, and now I am no longer getting the error.
But I think there should be some way to copy an existing object onto itself via the API, since I am able to do the same via the aws-cli.
Did you look at the copyObject operation in the Apache Camel AWS2 S3 component? https://camel.apache.org/components/3.17.x/aws2-s3-component.html#_s3_producer_operation_examples
There is an example of the headers needed to make the Copy Object operation work.
// Copy an object with the AWS SDK for PHP.
require 'vendor/autoload.php';

use Aws\S3\S3Client;

$sourceBucket = '*** Your Source Bucket Name ***';
$sourceKeyname = '*** Your Source Object Key ***';
$targetBucket = '*** Your Target Bucket Name ***';

$s3 = new S3Client([
    'version' => 'latest',
    'region'  => 'us-east-1'
]);

// Copy an object.
$s3->copyObject([
    'Bucket'     => $targetBucket,
    'Key'        => "{$sourceKeyname}-copy",
    'CopySource' => "{$sourceBucket}/{$sourceKeyname}",
]);
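That example copies the object to a new key. For copying an object onto itself at the same key (the original problem), the request must also tell S3 to replace the metadata instead of copying it. Below is a minimal sketch with the plain AWS SDK for Java v2, which the camel-aws2-s3 component uses under the hood; the bucket and key names are placeholders.

import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.CopyObjectRequest;
import software.amazon.awssdk.services.s3.model.MetadataDirective;

S3Client s3 = S3Client.create();

// Source and destination are the same object, so S3 requires
// MetadataDirective=REPLACE (sent as the x-amz-metadata-directive request header).
CopyObjectRequest request = CopyObjectRequest.builder()
        .sourceBucket("my-bucket")              // placeholder
        .sourceKey("path/to/object.txt")        // placeholder
        .destinationBucket("my-bucket")
        .destinationKey("path/to/object.txt")
        .metadataDirective(MetadataDirective.REPLACE)
        .build();

s3.copyObject(request);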
Uploading a large file to SharePoint Online (Document library) via the MS Graph SDK (Java) works for me, but also adding metadata on an upload seems to be hard.
I tried to add the metadata inside DriveItemUploadableProperties, because I didn't find any hints about where the right place should be:
DriveItemUploadableProperties value = new DriveItemUploadableProperties();
value.additionalDataManager().put("Client", new JsonPrimitive("Test ABC"));
var driveItemCreateUploadSessionParameterSet = DriveItemCreateUploadSessionParameterSet.newBuilder().withItem(value);
UploadSession uploadSession = graphClient.sites(SPValues.SITE_ID).lists(SPValues.LIST_ID).drive().root().itemWithPath(path).createUploadSession(driveItemCreateUploadSessionParameterSet.build()).buildRequest().post();
LargeFileUploadTask<DriveItem> largeFileUploadTask = new LargeFileUploadTask<>(uploadSession, graphClient, fileStream, streamSize, DriveItem.class);
LargeFileUploadResult<DriveItem> upload = largeFileUploadTask.upload(customConfig);
This results in a 400: Bad Request response.
How can I add metadata on an upload the right way?
AFAIK, you cannot add metadata while uploading to SharePoint. You will have to make two separate requests: one to upload the file, and one to add additional metadata to the file that you just uploaded.
Before adding your own custom metadata, you must register the facets/schema with OneDrive. Refer to this doc:
https://learn.microsoft.com/en-us/onedrive/developer/rest-api/concepts/custom-metadata-facets?view=odsp-graph-online
But be aware that because custom facets are a preview feature, at the time of this post you have to contact Microsoft by email and get the custom facet manually approved; unfortunately, there is no automated API for this.
If you somehow manage to get the custom facet approved:
DriveItemUploadableProperties has preset fields such as filename, size, etc., meant to represent the upload task and basic details about the file; there are no options to add additional metadata to it. Refer to the documentation for DriveItemUploadableProperties:
https://learn.microsoft.com/en-us/graph/api/resources/driveitemuploadableproperties?view=graph-rest-1.0
I assume that when you say "Uploading a large file to SharePoint Online (Document library) via the MS Graph SDK (Java) works for me", you are able to successfully upload the file and obtain the item ID from the response for the uploaded file. You can use that item ID to update the metadata of the file in a second request. Specifically, refer to the update driveitem documentation here:
https://learn.microsoft.com/en-us/graph/api/driveitem-update?view=graph-rest-1.0&tabs=http
GraphServiceClient graphClient = GraphServiceClient.builder().authenticationProvider( authProvider ).buildClient();
// The generic driveItem update pattern from the docs: patch the uploaded item by its ID.
DriveItem driveItem = new DriveItem();
driveItem.name = "new-file-name.docx";
graphClient.me().drive().items("{item-id}")
    .buildRequest()
    .patch(driveItem);
Edit:
As additional information, you can use a ListItem rather than a DriveItem resource and put custom fields there. However, be aware that unlike the custom facets mentioned above, custom metadata stored in these fields is not indexed and is not meant to be queried or filtered on large datasets, which is the most common use case for metadata. When querying these fields you must include
Prefer : HonorNonIndexedQueriesWarningMayFailRandomly
in the request header, and as the header name suggests, the query may fail randomly on large datasets.
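A minimal sketch of that second request, in the same Graph SDK fluent style used above, assuming a custom column named "Client" already exists on the document library and that you have looked up the list-item ID of the uploaded file (both are assumptions here):

FieldValueSet fieldValueSet = new FieldValueSet();
fieldValueSet.additionalDataManager().put("Client", new JsonPrimitive("Test ABC"));

// "{item-id}" is the ID of the list item behind the uploaded drive item (hypothetical placeholder).
graphClient.sites(SPValues.SITE_ID)
    .lists(SPValues.LIST_ID)
    .items("{item-id}")
    .fields()
    .buildRequest()
    .patch(fieldValueSet);

Remember the Prefer header caveat above if you later query or filter on such a non-indexed field.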
I'm trying to list all so-called folders and sub-folders in an S3 bucket.
Now, since I am trying to list all the folders in a path recursively, I am not using the withDelimiter() function.
All the so-called folder names should end with /, and this is my logic to list all the folders and sub-folders.
Here's the scala code (Intentionally not pasting the catch code here):
val awsCredentials = new BasicAWSCredentials(awsKey, awsSecretKey)
val client = new AmazonS3Client(awsCredentials)
def listFoldersRecursively(bucketName: String, fullPath: String): List[String] = {
try {
val objects = client.listObjects(bucketName).getObjectSummaries
val listObjectsRequest = new ListObjectsRequest()
.withPrefix(fullPath)
.withBucketName(bucketName)
val folderPaths = client
.listObjects(listObjectsRequest)
.getObjectSummaries()
.map(_.getKey)
folderPaths.filter(_.endsWith("/")).toList
}
}
Here's the structure of my bucket through an s3 client
Here's the list I am getting using this scala code
Without any apparent pattern, many folders are missing from the list of retrieved folders.
I did not use
client.listObjects(listObjectsRequest).getCommonPrefixes.toList
because it was returning an empty list for some reason.
P.S: Couldn't add photos in post directly because of being a new user.
Without any apparent pattern, many folders are missing from the list of retrieved folders.
Here's your problem: you are assuming there should always be objects with keys ending in / to symbolize folders.
This is an incorrect assumption. They will only be there if you created them, either via the S3 console or the API. There's no reason to expect them, as S3 doesn't actually need them or use them for anything, and the S3 service does not create them spontaneously, itself.
If you use the API to upload an object with key foo/bar.txt, this does not create the foo/ folder as a distinct object. It will appear as a folder in the console for convenience, but it isn't there unless at some point you deliberately created it.
Of course, the only way to upload such an object with the console is to "create" the folder unless it already appears -- but appears in the console does not necessarily equate to exists as a distinct object.
Filtering on endsWith("/") is invalid logic.
This is why the underlying API includes CommonPrefixes with each ListObjects response if delimiter and prefix are specified. This is a list of the next level of "folders", which you have to recursively drill down into in order to find the next level.
If you specify a prefix, all keys that contain the same string between the prefix and the first occurrence of the delimiter after the prefix are grouped under a single result element called CommonPrefixes. If you don't specify the prefix parameter, the substring starts at the beginning of the key. The keys that are grouped under the CommonPrefixes result element are not returned elsewhere in the response.
https://docs.aws.amazon.com/AmazonS3/latest/API/RESTBucketGET.html
You need to access this functionality with whatever library you are using, or you need to iterate the entire list of keys and discover the actual common prefixes on / boundaries using string splitting.
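For example, a sketch of that recursive drill-down with the AWS SDK for Java v1 (the client, bucket, and starting prefix are placeholders; a complete version would also have to follow truncated pages, as the answer below about pagination explains):

// Collect every "folder" by recursing into CommonPrefixes.
void collectFolders(AmazonS3 client, String bucket, String prefix, List<String> out) {
    ObjectListing listing = client.listObjects(new ListObjectsRequest()
            .withBucketName(bucket)
            .withPrefix(prefix)
            .withDelimiter("/"));        // the delimiter is what produces CommonPrefixes
    for (String commonPrefix : listing.getCommonPrefixes()) {
        out.add(commonPrefix);                               // e.g. "data/2019/"
        collectFolders(client, bucket, commonPrefix, out);   // drill into the next level
    }
}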
Well, in case someone faces the same problem in the future, the alternative logic I used is the one suggested by Michael above: I iterated through all the keys and split each one at its last occurrence of /. The part before the split, plus /, was the key of a folder, and I appended it to another list. At the end, I returned the distinct entries of that list. This gave me all the folders and sub-folders under a certain prefix location.
Note that I didn't use CommonPrefixes because I wasn't using any delimiter, and that's because I didn't want the list of folders at only one level; I wanted to get all the folders and sub-folders recursively:
def listFoldersRecursively(bucketName: String, fullPath: String): List[String] = {
  try {
    val objects = client.listObjects(bucketName).getObjectSummaries
    val listObjectsRequest = new ListObjectsRequest()
      .withPrefix(fullPath)
      .withBucketName(bucketName)
    val folderPaths = client.listObjects(listObjectsRequest)
      .getObjectSummaries()
      .map(_.getKey)
      .toList
    // Cut every key at its last "/" and collect the distinct prefixes as "folders".
    val foldersList: ArrayBuffer[String] = ArrayBuffer()
    for (folderPath <- folderPaths) {
      val split = folderPath.splitAt(folderPath.lastIndexOf("/"))
      if (!split._1.equals(""))
        foldersList += split._1 + "/"
    }
    foldersList.toList.distinct
  }
}
P.S.: The catch block is intentionally missing due to irrelevancy.
The listObjects call (and others like it) paginates, returning up to 1,000 entries per response.
From the doc:
Because buckets can contain a virtually unlimited number of keys, the complete results of a list query can be extremely large. To manage large result sets, Amazon S3 uses pagination to split them into multiple responses. Always check the ObjectListing.isTruncated() method to see if the returned listing is complete or if additional calls are needed to get more results. Alternatively, use the AmazonS3Client.listNextBatchOfObjects(ObjectListing) method as an easy way to get the next page of object listings.
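Following that advice, a sketch of the paging loop with the v1 SDK (reusing a client like the one in the question; the bucket and prefix are placeholders):

List<String> keys = new ArrayList<>();
ObjectListing listing = client.listObjects(new ListObjectsRequest()
        .withBucketName("my-bucket")
        .withPrefix("some/prefix/"));
while (true) {
    for (S3ObjectSummary summary : listing.getObjectSummaries()) {
        keys.add(summary.getKey());
    }
    if (!listing.isTruncated()) {
        break;                                        // last page reached
    }
    listing = client.listNextBatchOfObjects(listing); // fetch the next page
}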
I want to upload an object to a versioned Amazon S3 bucket (using the Java AWS SDK) and set a custom version on this object (the goal is to set the same version on all objects uploaded at once):
PutObjectResult por = amazonS3Client.putObject(...);
por.setVersionId("custom_version");
So, is this the right way to set a version on the uploaded object?
Does this code lead to 2 separate requests to Amazon?
What if the Internet connection breaks while por.setVersionId(..) is being called?
Why doesn't por.setVersionId(..) throw an exception such as SdkClientException if this method really is trying to set a version ID on the Amazon server?
setVersionId would be something the SDK library itself uses to populate the versionId returned by the service when the object is created, so that you can retrieve it if you want to know what it is.
Version IDs in S3 are system-generated opaque strings that uniquely identify a specific version of an object. You can't assign them.
The documentation uses some unfortunate examples like "111111" and "222222," which do not resemble real version-ids. There's a better example further down the page, where you'll find this:
Unique version IDs are randomly generated, Unicode, UTF-8 encoded, URL-ready, opaque strings that are at most 1024 bytes long. An example version ID is 3/L4kqtJlcpXroDTDmJ+rmSpXd3dIbrHY+MTRCxf3vjVBH40Nr8X8gdRQBpUMLUo.
Only Amazon S3 generates version IDs.
They cannot be edited.
You don't get an error here because all this method does is set the versionId inside the PutObjectResult object in local memory after the upload has finished. It succeeds, but serves no purpose.
To store user-defined metadata with objects, such as your release/version-id, you'd need to use object metadata (x-amz-meta-*) or the new object tagging feature in S3.
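For example, a sketch of attaching your own release identifier as user-defined metadata at upload time (the metadata key, bucket, object key, and file below are placeholders, not anything the SDK prescribes):

ObjectMetadata metadata = new ObjectMetadata();
// Stored and returned as the x-amz-meta-release-version header on the object.
metadata.addUserMetadata("release-version", "custom_version");

PutObjectRequest request = new PutObjectRequest("my-bucket", "my-key", new File("local-file.bin"))
        .withMetadata(metadata);
amazonS3Client.putObject(request);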
Is there a way to delete a resource from AWS S3 using the java sdk by URL?
I know you can delete a resource using a bucket name and keyName like this:
s3client.deleteObject(new DeleteObjectRequest(bucketName, keyName));
The issue is that I only have access to the resource URL, so I would need to manipulate the string to extract the bucket name and key name.
But if there were a way to delete by passing the URL, it would be much cleaner.
There doesn't appear to be a way to simply pass the URL.
There's this, though:
AmazonS3URI
public AmazonS3URI(String str)
Creates a new AmazonS3URI by parsing the given string. String will be URL encoded before generating the URI.
http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/AmazonS3URI.html
You can call getKey and getBucket on it to extract the strings you need. It's still messy, but at least it looks like you don't have to write your own parser.
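For example, a minimal sketch (the URL is a placeholder and s3client is the same client as in the question):

// Parse the bucket and key out of the URL, then delete as usual.
AmazonS3URI s3Uri = new AmazonS3URI("https://my-bucket.s3.amazonaws.com/path/to/object.txt");
s3client.deleteObject(new DeleteObjectRequest(s3Uri.getBucket(), s3Uri.getKey()));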
I'm migrating my GAE app from the deprecated File API to the Google Cloud Storage Client Library.
I used to persist the blobKey, but since there is only partial support for it (as specified here), from now on I'll have to persist the object name.
Unfortunately, the object name that comes from GCS looks more or less like this:
/gs/bucketname/819892hjd81dh19gf872g8211
as you can see, it also contains the bucket name
Here's the issue: every time I need to get the file for further processing (or to serve it in a servlet), I need to create an instance of GcsFileName(bucketName, objectName), which gives me something like
/bucketName/gs/bucketName/akahsdjahslagfasgfjkasd
which (of course) doesn't work.
So, my question is:
- how can I generate a GcsFileName from the objectName?
UPDATE
I tried using the objectName as a BlobKey, but it just doesn't work :(
InputStream is = new BlobstoreInputStream(blobstoreService.createGsBlobKey("/gs/bucketName/akahsdjahslagfasgfjkasd"));
I got the usual error:
BlobstoreInputStream received an invalid blob key
How do I get the file using the ObjectName???
If you have persisted and retrieved, for example, a String objname whose value is "/gs/bucketname/819892hjd81dh19gf872g8211", you could split it on "/" (String[] pieces = objname.split("/")) and use the pieces appropriately in the call to GcsFileName.
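A minimal sketch of that, assuming the persisted string always has the form "/gs/<bucket>/<object>" (note the GCS client library class is spelled GcsFilename):

String objname = "/gs/bucketname/819892hjd81dh19gf872g8211";
// Splits into ["", "gs", "bucketname", "819892hjd81dh19gf872g8211"]; the limit of 4 keeps any "/" inside the object name intact.
String[] pieces = objname.split("/", 4);
GcsFilename fileName = new GcsFilename(pieces[2], pieces[3]);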