I am trying to upload a file to Amazon S3 storage using the Java SDK, setting an explicit expiration date on the object via ObjectMetadata. When I run the program it uploads to S3 and sets the expiration date in the object metadata as expected, but the object never seems to be deleted after the expiration date. I am not sure where I am going wrong. Below is the code snippet I used to set the object metadata.
PutObjectRequest putObjectRequest = new PutObjectRequest(bucketName, key, file);
ObjectMetadata objectMetadata = new ObjectMetadata();
// Sets the HTTP Expires header to 24 hours from now (Joda-Time DateTime).
objectMetadata.setHttpExpiresDate(new DateTime().plusDays(1).toDate());
putObjectRequest.setMetadata(objectMetadata);
return s3.putObject(putObjectRequest);
I have been going through some of the Amazon documentation (https://docs.aws.amazon.com/AmazonS3/latest/dev/manage-lifecycle-using-java.html), which says to set a bucket lifecycle configuration rule. I am not sure whether, if I apply this rule, it will be applied to all folders and objects under this bucket, or only to the objects I upload through my Java program.
Please advise. Thanks in advance!
According to the documentation, you can't directly set an expiration date on a particular object.
To solve this problem you can:
Define a lifecycle rule for the whole bucket (expire every object a number of days after creation), or
Define a lifecycle rule that expires only objects with a specific tag or prefix after a number of days (see the sketch below).
To create the rules, use the documentation:
https://docs.aws.amazon.com/AmazonS3/latest/userguide/how-to-set-lifecycle-configuration-intro.html
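For example, here is a minimal sketch using the AWS SDK for Java v1 (the same SDK as your snippet); the "temp/" prefix and the rule ID are placeholder assumptions. Only objects matching the rule's filter are expired, so the rest of the bucket is untouched:
// Expire objects under the "temp/" prefix one day after creation
// (BucketLifecycleConfiguration, LifecycleFilter and LifecyclePrefixPredicate
// come from com.amazonaws.services.s3.model and its lifecycle subpackage).
BucketLifecycleConfiguration.Rule rule = new BucketLifecycleConfiguration.Rule()
        .withId("expire-temp-objects-after-1-day")
        .withFilter(new LifecycleFilter(new LifecyclePrefixPredicate("temp/")))
        .withExpirationInDays(1)
        .withStatus(BucketLifecycleConfiguration.ENABLED);
// Caution: this call replaces any lifecycle configuration already on the
// bucket, so merge with getBucketLifecycleConfiguration(bucketName) first.
s3.setBucketLifecycleConfiguration(bucketName, new BucketLifecycleConfiguration().withRules(rule));
A rule without a filter (or with an empty prefix) applies to every object in the bucket, including existing ones, which answers the question above: the rule is bucket-wide unless you scope it with a prefix or tag.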
Uploading a large file to SharePoint Online (Document library) via the MS Graph SDK (Java) works for me, but also adding metadata on the upload seems to be hard.
I tried to add the metadata inside the DriveItemUploadableProperties, because I didn't find any hints about where the right place would be:
DriveItemUploadableProperties value = new DriveItemUploadableProperties();
value.additionalDataManager().put("Client", new JsonPrimitive("Test ABC"));
var driveItemCreateUploadSessionParameterSet = DriveItemCreateUploadSessionParameterSet.newBuilder().withItem(value);
UploadSession uploadSession = graphClient.sites(SPValues.SITE_ID).lists(SPValues.LIST_ID).drive().root().itemWithPath(path).createUploadSession(driveItemCreateUploadSessionParameterSet.build()).buildRequest().post();
LargeFileUploadTask<DriveItem> largeFileUploadTask = new LargeFileUploadTask<>(uploadSession, graphClient, fileStream, streamSize, DriveItem.class);
LargeFileUploadResult<DriveItem> upload = largeFileUploadTask.upload(customConfig);
This results in a 400 : Bad Request response
How can I add metadata on an upload the right way?
AFAIK, you cannot add metadata while uploading to SharePoint. You will have to make two separate requests: one to upload the file, and one to add the additional metadata to the file you just uploaded.
Before adding your own custom metadata, you must register the facets / schema with OneDrive. Refer to this doc:
https://learn.microsoft.com/en-us/onedrive/developer/rest-api/concepts/custom-metadata-facets?view=odsp-graph-online
But be aware that custom facets are a preview feature: at the time of this post you have to contact Microsoft by email and get the custom facet manually approved; unfortunately there is no automated API for this.
If you do manage to get the custom facet approved:
DriveItemUploadableProperties has preset fields such as file name and size, meant to represent the upload task and basic details about the file; there is no option to add additional metadata to it. Refer to the documentation for DriveItemUploadableProperties:
https://learn.microsoft.com/en-us/graph/api/resources/driveitemuploadableproperties?view=graph-rest-1.0
I assume that when you say, "Uploading a large file to SharePoint Online (Document library) via the MS Graph SDK (Java) works for me", you are able to successfully upload the file and obtain the item ID in the response. You can use that item ID to update the metadata of the file via a second request. Specifically, refer to update driveItem here:
https://learn.microsoft.com/en-us/graph/api/driveitem-update?view=graph-rest-1.0&tabs=http
GraphServiceClient graphClient = GraphServiceClient.builder().authenticationProvider(authProvider).buildClient();
// Example from the docs: this patch renames the item; other writable
// DriveItem properties can be set the same way.
DriveItem driveItem = new DriveItem();
driveItem.name = "new-file-name.docx";
graphClient.me().drive().items("{item-id}")
    .buildRequest()
    .patch(driveItem);
Edit:
As additional information, you can use a ListItem rather than a DriveItem resource and put your custom fields there. However, be aware that unlike the custom facets mentioned above, custom metadata stored in these fields is not indexed and is not meant to be queried or filtered on large datasets, which is the most common use case for metadata. When querying these fields you must include the header
Prefer: HonorNonIndexedQueriesWarningMayFailRandomly
in the request, and as the name warns, the query may fail randomly on large datasets.
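For completeness, here is a minimal sketch of that ListItem route with the msgraph-sdk-java client from the question; the "Client" column and the "{item-id}" placeholder are assumptions, and the column must actually exist on your list:
// Patch the uploaded item's listItem fields (SharePoint columns).
FieldValueSet fields = new FieldValueSet();
fields.additionalDataManager().put("Client", new JsonPrimitive("Test ABC"));
graphClient.sites(SPValues.SITE_ID)
        .lists(SPValues.LIST_ID)
        .items("{item-id}")
        .fields()
        .buildRequest()
        .patch(fields);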
I want to upload an object to an Amazon versioned bucket (using the Java AWS SDK) and set a custom version on this object (the goal is to set the same version on all objects uploaded at once).
PutObjectResult por = amazonS3Client.putObject(...);
por.setVersionId("custom_version");
So, is this the right way to set a version on the uploaded object?
Does this code lead to two separate requests to Amazon?
What if the Internet connection breaks while por.setVersionId(..) is being called?
Why doesn't por.setVersionId(..) throw an exception such as SdkClientException if this method really tries to set a version ID on the Amazon server?
setVersionId would be something the SDK library itself uses to populate the versionId returned by the service when the object is created, so that you can retrieve it if you want to know what it is.
Version IDs in S3 are system-generated opaque strings that uniquely identify a specific version of an object. You can't assign them.
The documentation uses some unfortunate examples like "111111" and "222222," which do not resemble real version-ids. There's a better example further down the page, where you'll find this:
Unique version IDs are randomly generated, Unicode, UTF-8 encoded, URL-ready, opaque strings that are at most 1024 bytes long. An example version ID is 3/L4kqtJlcpXroDTDmJ+rmSpXd3dIbrHY+MTRCxf3vjVBH40Nr8X8gdRQBpUMLUo.
Only Amazon S3 generates version IDs.
They cannot be edited.
You don't get an error here because all this method does is set the versionId inside the PutObjectResult object in local memory after the upload has finished. It succeeds, but serves no purpose.
To store user-defined metadata with objects, such as your release/version-id, you'd need to use object metadata (x-amz-meta-*) or the new object tagging feature in S3.
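A minimal sketch of that alternative, assuming the AWS SDK for Java v1; the "release-id" key and its value are made-up examples:
// Store your own release identifier as user-defined object metadata;
// S3 sends and returns it as the x-amz-meta-release-id header.
ObjectMetadata metadata = new ObjectMetadata();
metadata.addUserMetadata("release-id", "build-42");
PutObjectResult result = amazonS3Client.putObject(
        new PutObjectRequest(bucketName, key, file).withMetadata(metadata));
// The system-generated version id (in a versioned bucket) is read-only:
String s3VersionId = result.getVersionId();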
I am trying to update the content of a Google Doc file with the content of another Google Doc file. The reason I don't use the copy method of the API is that it creates a new file with another ID. My goal is to keep the current ID of the file. This is a code snippet which unfortunately does nothing:
com.google.api.services.drive.Drive.Files.Get getDraft = service.files().get(draftID);
File draft = driveManager.getFileBackoffExponential(getDraft);
com.google.api.services.drive.Drive.Files.Update updatePublished = service.files().update(publishedID, draft);
driveManager.updateFileBackoffExponential(updatePublished);
The two backoffExponential functions just launch the execute method on the object.
Googling around, I found out that the update method has another signature:
public Update update(java.lang.String fileId, com.google.api.services.drive.model.File content, com.google.api.client.http.AbstractInputStreamContent mediaContent)
Thing is, I have no idea how to retrieve the mediaContent of a Google file such as a Google Doc.
The last resort could be a Google Apps Script but I'd rather avoid that since it's awfully slow and unreliable.
Thank you.
EDIT: I am using Drive API v3.
Try the Google Drive REST update:
Updates a file's metadata and/or content with patch semantics. This method supports an /upload URI and accepts uploaded media with the following characteristics:
Maximum file size: 5120 GB. Accepted media MIME types: */*
To download a Google file in a usable format, you need to specify the export MIME type. Since you're working with a Google Doc, you can try application/vnd.openxmlformats-officedocument.wordprocessingml.document. See Download files for more info.
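Putting the two together, here is a hedged sketch against the Drive v3 Java client: export the draft Doc's content as media, then pass it to the three-argument update so the published file keeps its ID. The .docx MIME type is an assumption, and whether Drive converts the media back into the native Doc format depends on the target file:
// Export the draft Doc's body as .docx media content.
ByteArrayOutputStream buffer = new ByteArrayOutputStream();
service.files()
        .export(draftID, "application/vnd.openxmlformats-officedocument.wordprocessingml.document")
        .executeMediaAndDownloadTo(buffer);
AbstractInputStreamContent mediaContent = new ByteArrayContent(
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
        buffer.toByteArray());
// Update the published file in place, so publishedID is preserved.
com.google.api.services.drive.model.File metadata = new com.google.api.services.drive.model.File();
service.files().update(publishedID, metadata, mediaContent).execute();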
I'm migrating my GAE app from the deprecated File API to Google Cloud Storage Client Library.
I used to persist the blobKey, but since there is partial support for it (as specified here) from now on I'll have to persist the object name.
Unfortunately the object name that comes from the GCS looks more or less like this
/gs/bucketname/819892hjd81dh19gf872g8211
As you can see, it also contains the bucket name.
Here's the issue: every time I need to get the file for further processing (or to serve it in a servlet), I need to create an instance of GcsFileName(bucketName, objectName), which gives me something like
/bucketName/gs/bucketName/akahsdjahslagfasgfjkasd
which (of course) doesn't work.
So, my question is: how can I generate a GcsFileName from the objectName?
UPDATE
I tried using the objectName as a BlobKey, but it just doesn't work :(
InputStream is = new BlobstoreInputStream(blobstoreService.createGsBlobKey("/gs/bucketName/akahsdjahslagfasgfjkasd"));
I got the usual error:
BlobstoreInputStream received an invalid blob key
How do I get the file using the objectName?
If you have persisted and retrieved, e.g., a string objname with the value "/gs/bucketname/819892hjd81dh19gf872g8211", you can split it on "/" (String[] pieces = objname.split("/")) and use the pieces appropriately in the call to GcsFilename.
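A minimal sketch of that parsing, assuming the appengine-gcs-client GcsFilename class. Since object names may themselves contain slashes, substring is safer than split:
// "/gs/bucketname/819892hjd81dh19gf872g8211" -> bucket name + object name
String rest = objname.substring("/gs/".length());   // "bucketname/819892..."
int slash = rest.indexOf('/');
GcsFilename gcsFilename = new GcsFilename(
        rest.substring(0, slash),       // bucket name
        rest.substring(slash + 1));     // object name (may contain '/')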
I wrote a Google App Engine application that makes use of Blobstore to save programmatically-generated data. To do so, I used the Files API, which unfortunately has been deprecated in favor to Google Cloud Storage. So I'm rewriting my helper class to work with GCS.
I'd like to keep the interface as similar as possible to what it was before, also because I persist BlobKeys in the Datastore to keep references to the files (and changing the model of a production application is always painful). When I save something to GCS, I retrieve a BlobKey with
BlobKey blobKey = blobstoreService.createGsBlobKey("/gs/" + fileName.getBucketName() + "/" + fileName.getObjectName());
as prescribed here, and I persist it in the Datastore.
So here's the question: the documentation tells me how to serve a GCS file with blobstoreService.serve(blobKey, resp); in a servlet response, BUT how can I retrieve the file content (as InputStream, byte array or whatever) to use it in my code for further processing? In my current implementation I do that with a FileReadChannel reading from an AppEngineFile (both deprecated).
Here is the code to open a Google Cloud Storage object as an InputStream. Unfortunately, you have to use the bucket name and object name, not the blob key:
GcsFilename gcsFilename = new GcsFilename(bucketName, objectName);
GcsService gcsService = GcsServiceFactory.createGcsService();
// Open a read channel at offset 0 and wrap it as an InputStream.
ReadableByteChannel readChannel = gcsService.openReadChannel(gcsFilename, 0);
InputStream stream = Channels.newInputStream(readChannel);
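If you need the content as a byte array for further processing, a plain-Java follow-up sketch:
// Drain the stream into a byte array.
ByteArrayOutputStream out = new ByteArrayOutputStream();
byte[] buf = new byte[8192];
int n;
while ((n = stream.read(buf)) != -1) {
    out.write(buf, 0, n);
}
byte[] fileContent = out.toByteArray();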
Given a blobKey, use the BlobstoreInputStream class to read the value from Blobstore, as described in the documentation:
BlobstoreInputStream in = new BlobstoreInputStream(blobKey);
You can get the Cloud Storage filename only in the upload handler (fileInfo.gs_object_name) and must store it in your database yourself. After that it is lost, and it does not seem to be preserved in BlobInfo or other metadata structures.
Google says:
Unlike BlobInfo metadata, FileInfo metadata is not persisted to the datastore. (There is no blob key either, but you can create one later if needed by calling create_gs_key.) You must save the gs_object_name yourself in your upload handler or this data will be lost.
Sorry, this is a Python link, but it should be easy to find something similar in Java.
https://developers.google.com/appengine/docs/python/blobstore/fileinfoclass
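In Java the counterpart would be along these lines (a sketch; "file" stands for your upload form's field name, and getFileInfos/getGsObjectName come from the App Engine Blobstore API):
// In the upload servlet: capture the GCS object name and persist it yourself.
Map<String, List<FileInfo>> infos = blobstoreService.getFileInfos(req);
FileInfo info = infos.get("file").get(0);        // "file" = upload field name
String gsObjectName = info.getGsObjectName();    // e.g. "/gs/bucket/object"
// ...save gsObjectName to the datastore here...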
Here is the Blobstore approach (sorry, this is for Python, but I am sure you find it quite similar for Java):
blob_reader = blobstore.BlobReader(blob_key)
if blob_reader:
    file_content = blob_reader.read()