I am building a small REST API service to store and retrieve photos. For that, I am using S3 as follows:
public String upload(InputStream uploadedInputStream,
        Map<String, String> metadata, String group, String filename) throws InterruptedException {
    TransferManager transferManager = TransferManagerBuilder.standard()
            .withS3Client(amazonS3)
            .build();
    ObjectMetadata objectMetadata = new ObjectMetadata();
    objectMetadata.setContentType(metadata.get(Configuration.CONTENT_TYPE_METADATA_KEY));
    // TODO: 26/06/20 Add content-type to metadata
    String filepath = group + "/" + filename;
    transferManager.upload(new PutObjectRequest(
            configuration.getProperty("aws.s3.bucket"),
            filepath,
            uploadedInputStream,
            objectMetadata)).waitForUploadResult();
    return amazonS3.getUrl(configuration.getProperty("aws.s3.bucket"), filepath).toString();
}
The URL returned by the function looks like https://photos.tarkshala.com.s3.ap-south-1.amazonaws.com/default-group/1593911534320%230. When accessed, it does not load correctly.
When I open it using the object URL (https://s3.ap-south-1.amazonaws.com/photos.tarkshala.com/default-group/1593911534320%230) given in the AWS S3 console, it shows up fine.
Why does the getUrl method not return the second URL, and is there another method/API that does?
This is because of recent changes by AWS regarding S3.
When using virtual hosted-style buckets with SSL, the SSL wildcard
certificate only matches buckets that do not contain dots ("."). To
work around this, use HTTP or write your own certificate verification
logic. For more information, see Amazon S3 Path Deprecation Plan – The Rest of the Story.
Create a bucket whose name does not contain a dot, use the path-style URL, or see VirtualHostingCustomURLs.
S3 supports two types of URLs to access an object.
Virtual-hosted-style requests
https://bucket-name.s3.Region.amazonaws.com/key-name
Path-style requests
https://s3.Region.amazonaws.com/bucket-name/key-name
Important
Buckets created after September 30, 2020, will support only virtual
hosted-style requests. Path-style requests will continue to be
supported for buckets created on or before this date. For more
information, see Amazon S3 Path Deprecation Plan – The Rest of the
Story.
S3 VirtualHosting
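If you specifically want the path-style form from code, one option is to build a client with path-style access enabled; getUrl on that client should then produce the second form. This is only a sketch, assuming the AWS SDK for Java v1 used in the question; the bucket name is taken from the question and the key is a placeholder.
// Sketch: enable path-style addressing so getUrl() yields
// https://s3.<region>.amazonaws.com/<bucket>/<key> instead of the virtual-hosted form.
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

AmazonS3 pathStyleClient = AmazonS3ClientBuilder.standard()
        .withRegion("ap-south-1")
        .withPathStyleAccessEnabled(true)
        .build();
String url = pathStyleClient
        .getUrl("photos.tarkshala.com", "default-group/photo.jpg")
        .toString();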
In general, I need to create a Java app that performs some operations on Azure Storage,
such as uploading a file, appending to a file, renaming, checking whether a file exists, and so on. IMPORTANT: it has to communicate with the DFS endpoint https://xxxx.dfs.core.windows..
But I have encountered some problems:
While using BlobContainerClient to upload a file to Azure Storage, this error appears:
com.azure.storage.blob.models.BlobStorageException: Status code 400,
"{"error":{"code":"MissingRequiredHeader","message":"An HTTP header
that's mandatory for this request is not
specified.\nRequestId:b225d695-201f-00ed-212e-c7c9e8000000\nTime:2021-10-22T10:23:12.4983407Z"}}"
How can I avoid this situation? Which header is required, and how do I set it?
Afterwards I implemented something similar using DataLakeFileSystemClient, and this time uploading the file worked fine. Unfortunately, not all operations can be performed. e.g. the exists() method internally uses a BlobContainerClient
and performs the call via the blob endpoint https://xxxx.blob.core.windows.., which is forbidden in my case.
IMO it is caused by BlobContainerClientBuilder.endpoint(String endpoint) setting the blobContainerClient
endpoint to the blob endpoint, while the DataLakeFileSystemClient keeps the dfs endpoint.
source code:
public DataLakeFileSystemClientBuilder endpoint(String endpoint) {
    // Ensure endpoint provided is dfs endpoint
    endpoint = DataLakeImplUtils.endpointToDesiredEndpoint(endpoint, "dfs", "blob");
    blobContainerClientBuilder.endpoint(DataLakeImplUtils.endpointToDesiredEndpoint(endpoint, "blob", "dfs"));
    // ...
So the question is: is it a bug in BlobContainerClientBuilder.endpoint(String endpoint)?
Or how can this problem be fixed so that the same endpoint is used for both clients?
Currently I have implemented a workaround and I'm using both clients: DataLakeFileSystemClient to perform actions
like upload, append, etc., and BlobContainerClient to check whether a file exists. I would like to use only one of the clients.
Could you help me somehow, please?
Azure Blob Storage is designed for storing large amounts of unstructured data, i.e. data that does not adhere to a particular data model or definition, such as text or binary data.
Blob Storage provides three resources: the storage account (SA), containers inside the SA, and blobs inside a container. We use a few Java classes to interact with these resources.
The BlobContainerClient class allows you to manipulate Azure Storage containers and their blobs. This class is mainly used to work on containers (file systems), so if you want to work on or manipulate blobs (files), it's recommended to use BlobClient.
Check the following snippets for creating a container and uploading a file.
Create a container using a BlobContainerClient.
blobContainerClient.create();
Upload BinaryData to a blob using a BlobClient generated from a BlobContainerClient.
BlobClient blobClient = blobContainerClient.getBlobClient("myblockblob");
String dataSample = "samples";
blobClient.upload(BinaryData.fromString(dataSample));
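For context, a minimal sketch of how the blobContainerClient used above might be constructed; the connection string and container name below are placeholders, not values from the question.
import com.azure.storage.blob.BlobContainerClient;
import com.azure.storage.blob.BlobContainerClientBuilder;

// Placeholder connection string and container name for illustration only.
BlobContainerClient blobContainerClient = new BlobContainerClientBuilder()
        .connectionString("<your-storage-connection-string>")
        .containerName("mycontainer")
        .buildClient();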
To rename a blob (file), copy-and-delete is the only approach. For larger blobs you need to start an asynchronous copy and check periodically for its completion.
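A hedged sketch of that copy-then-delete rename with the v12 SDK; the blob names are placeholders, and a private source blob may additionally need a SAS token as the copy source.
import com.azure.storage.blob.BlobClient;
import java.time.Duration;

BlobClient source = blobContainerClient.getBlobClient("old-name.txt");
BlobClient destination = blobContainerClient.getBlobClient("new-name.txt");
// Start a server-side copy, poll until it completes, then remove the original.
destination.beginCopy(source.getBlobUrl(), Duration.ofSeconds(1)).waitForCompletion();
source.delete();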
Check the Manage blobs with Java v12 SDK and Azure Storage Blob client library for Java documents for more information.
I have a Java method that gets me a URL with which I can upload an object to a bucket in Google Cloud Storage. The generated URL is valid for 30 seconds (tested OK).
public String getSignedUploadLink(String bucketName, String objectName, String mimeType)
throws Exception {
try {
// check if bucket exists, and create one with notification, if it doesn't.
if (!bucketExists(bucketName)) {
createBucket(bucketName);
}
// Define Resource
BlobInfo blobInfo = BlobInfo.newBuilder(BlobId.of(bucketName, objectName)).build();
// Specify the object's content type.
Map<String, String> extensionHeaders = new HashMap<>();
extensionHeaders.put("Content-Type", mimeType);
//setting it to expire in 30 seconds
return storage.signUrl(
blobInfo,
30,
TimeUnit.SECONDS,
Storage.SignUrlOption.httpMethod(HttpMethod.PUT),
Storage.SignUrlOption.withExtHeaders(extensionHeaders),
Storage.SignUrlOption.withV4Signature()).toString();
} catch (Exception e) {
// Handle the caught exception, then rethrow so the method either returns a URL or fails explicitly
throw e;
}
}
Using HTTP PUT against the link generated by the above method, I can successfully upload an object to Google Cloud Storage. The weird thing is that I'm also able to upload another object (with the same content type, of course) if I send another PUT request to the same URL before it expires.
Is it supposed to work like this? I was under the impression that once an upload link is used to upload something, it invalidates itself, regardless of the expiration period. Am I missing something here? I could really use some help with this.
Yes, that's the way it works. There is nothing in the documentation for signed URLs that suggests that they invalidate themselves by any condition other than the time you specify when you create it. If you say it's good for 30 seconds, then that's how long the URL can be used.
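To illustrate, a rough sketch of a client-side PUT against such a signed URL; signedUrl, the file path, and mimeType are placeholders. Any number of these requests will succeed until the 30 seconds elapse.
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Paths;

byte[] payload = Files.readAllBytes(Paths.get("/tmp/sample.bin"));
HttpURLConnection connection = (HttpURLConnection) new URL(signedUrl).openConnection();
connection.setDoOutput(true);
connection.setRequestMethod("PUT");
// Must match the Content-Type that was signed into the URL via the extension headers.
connection.setRequestProperty("Content-Type", mimeType);
try (OutputStream out = connection.getOutputStream()) {
    out.write(payload);
}
System.out.println("Upload response code: " + connection.getResponseCode());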
I'm uploading an mp4 video to AWS S3 using a pre-signed URL. The upload succeeds, but when I download the video from S3 and try to play it in a media player (VLC or QuickTime), it doesn't play.
The generated pre-signed URL works fine for mp3, but the same problem as above also occurs for WAV and FLAC.
Code to generate the pre-signed url:
public String getPreSignedS3Url( final String userId, final String fileName )
{
Date expiration = new Date();
long expTimeMillis = expiration.getTime();
expTimeMillis += urlExpiry;
expiration.setTime(expTimeMillis);
String objectKey = StringUtils.getObjectKey( userId, fileName );
GeneratePresignedUrlRequest generatePresignedUrlRequest = new GeneratePresignedUrlRequest(
recordingBucketName, objectKey)
.withMethod(HttpMethod.PUT)
.withExpiration(expiration);
URL url = s3Client.generatePresignedUrl(generatePresignedUrlRequest);
return url.toString();
}
After I get the pre-signed URL from the method above, I make an HTTP PUT request from Postman with multipart/form-data in the request body, like this:
-H 'content-type: multipart/form-data; boundary=----WebKitFormBoundary7MA4YWxkTrZu0gW' \
-F 'file=@/Users/john/Downloads/sampleDemo.mp4'
The pre-signed URL looks like this:
https://meeting-recording.s3.eu-west-2.amazonaws.com/331902257/sampleDemo.mp4?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20190720T125751Z&X-Amz-SignedHeaders=host&X-Amz-Expires=3599&X-Amz-Credential=AKIAZDSMLZ3VDKNXQUXH%2F20190720%2Feu-west-2%2Fs3%2Faws4_request&X-Amz-Signature=dfb8054f0738e07e925e9880e4a8e5ebba0a1bd3c84a3ec78913239f65221992
I tried setting the content type to mp4 in the getPreSignedS3Url() method using generatePresignedUrlRequest.setContentType( "video/mp4" ) and adding Content-Type: "video/mp4" to the HTTP PUT request header, but it didn't work; it fails with a "Signature does not match" error.
I'm using S3 as my personal back-up hard drive. I expect to upload video and audio files to S3 using a pre-signed URL, download them at some point in the future, and be able to play them, but I'm unable to play them after downloading.
Does anyone know what could be causing this?
PUT requests to S3 don't support multipart/form-data. The request body needs to contain nothing but the binary object data. If you download your existing file from S3 and open it with a text editor, you'll find that S3 has preserved the multipart form structure inside the file, instead of interpreting it as a wrapper for the actual payload.
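For illustration, a hedged variant of the question's method that signs the Content-Type into the URL; the client must then send the same Content-Type header with the raw file bytes as the body (no multipart wrapper). The bucket, key, and expiration variables are reused from the question's code.
GeneratePresignedUrlRequest generatePresignedUrlRequest = new GeneratePresignedUrlRequest(
        recordingBucketName, objectKey)
        .withMethod(HttpMethod.PUT)
        .withContentType("video/mp4")
        .withExpiration(expiration);
URL url = s3Client.generatePresignedUrl(generatePresignedUrlRequest);
// Client side (conceptually): PUT <url> with header "Content-Type: video/mp4"
// and the file bytes as the request body, e.g.
// curl -X PUT -H "Content-Type: video/mp4" --data-binary @sampleDemo.mp4 "<url>"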
To add to the above answer (for future readers): I was using FormData as shown below to send the PUT request. Passing the file directly as the payload worked for me.
Don't do this:
var bodyData = new FormData();
bodyData.append('file', video);
axios.put(res?.data?.uploadURL, bodyData, {
  // ...request config (e.g. headers) as before
});
Instead, do this:
axios.put(res?.data?.uploadURL, video, {
  // ...same request config
});
I create data on the server (GAE) and I want to store it in Blobstore. I have seen many answers on how to do this by giving a Blobstore URL to the client, but there is no client or HTTP request here: it's just an asynchronous task.
So I guess I should use createUploadUrl() and, instead of giving this URL to a client, HTTP POST my data to it from my own code via URL Fetch. That seems awkward; isn't there another API for this?
Let's say the files I want in Blobstore are already stored in my default GCS bucket. Can I just tell Blobstore about them using the GCS location "/gs/bucketname/file"? I tried this with:
GcsFilename filename = new GcsFilename(bucketName, fileId);
String gcsKey = "/gs/" + bucketName + "/" + filename.getObjectName();
BlobKey blobKey = blobstoreService.createGsBlobKey(gcsKey);
GcsOutputChannel outputChannel = gcsService.createOrReplace(filename, GcsFileOptions.getDefaultInstance());
ObjectOutputStream oout = new ObjectOutputStream(Channels.newOutputStream(outputChannel));
oout.writeObject(myDataObjectToPersist);
oout.close();
// ...at some other point I have checked the file is correctly stored in
// GCS and I can fetch it using /gs/bucket/fileId
// but it doesn't seem to be in Blobstore, so when
InputStream stream = new BlobstoreInputStream(new BlobKey(blobKey.getKeyString()));
// ... this gives a BlobstoreInputStream$BlobstoreIOException: BlobstoreInputStream received an invalid blob key...
Is this conceptually wrong (i.e. if I use GcsOutputChannel to save it, I will never be able to read it through Blobstore, even if I create a BlobKey), or is it something that could work but I just did something wrong?
1K thanks
Why would you want to store the file in blobstore as opposed to writing and reading it directly from GCS?
Yes, you can create a BlobKey for a file stored in GCS and use that key with some of the Blobstore API (such as fetchData and serve), but unfortunately not all of it.
Parts of the Blobstore API (such as BlobstoreInputStream) depend on BlobInfo, which is not created when the file is written through the GCS client.
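As a rough sketch of what does work with a GCS-backed key (the bucket and object names below are placeholders):
import com.google.appengine.api.blobstore.BlobKey;
import com.google.appengine.api.blobstore.BlobstoreService;
import com.google.appengine.api.blobstore.BlobstoreServiceFactory;

BlobstoreService blobstoreService = BlobstoreServiceFactory.getBlobstoreService();
BlobKey gcsBackedKey = blobstoreService.createGsBlobKey("/gs/my-bucket/my-object");
// fetchData works for GCS-backed keys; here we read the first kilobyte.
byte[] firstKilobyte = blobstoreService.fetchData(gcsBackedKey, 0, 1023);
// serve(gcsBackedKey, response) also works inside a servlet,
// but BlobstoreInputStream does not, because no BlobInfo exists for this key.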
I'm updating existing objects in an Amazon S3 bucket to set some metadata. I'd like to set the HTTP Expires header for each object to better handle HTTP/1.0 clients.
We're using the AWS Java SDK, which allows for metadata changes to an object without re-uploading the object content. We do this using CopyObjectRequest to copy an object to itself. The ObjectMetadata class allows us to set the Cache-Control, Content-Type and several other headers. But not the Expires header.
I know that S3 stores and serves the Expires header for objects PUT using the REST API. Is there a way to do this from the Java SDK?
Updated to indicate that we are using CopyObjectRequest
To change the metadata of an existing Amazon S3 object, you need to copy the object to itself and provide the desired new metadata on the fly, see copyObject():
By default, all object metadata for the source object are copied to
the new destination object, unless new object metadata in the
specified CopyObjectRequest is provided.
This can be achieved approximately like so (fragment from the top of my head, so beware):
AmazonS3 s3 = new AmazonS3Client();
String bucketName = "bucketName";
String key = "key.txt";
ObjectMetadata newObjectMetadata = new ObjectMetadata();
// ... whatever you desire, e.g.:
newObjectMetadata.setHeader("Expires", "Thu, 21 Mar 2042 08:16:32 GMT");
CopyObjectRequest copyObjectRequest = new CopyObjectRequest()
    .withSourceBucketName(bucketName)
    .withSourceKey(key)
    .withDestinationBucketName(bucketName)
    .withDestinationKey(key)
    .withNewObjectMetadata(newObjectMetadata);
s3.copyObject(copyObjectRequest);
Please be aware of the following easy to miss, but important copyObject() constraint:
The Amazon S3 Access Control List (ACL) is not copied to the new
object. The new object will have the default Amazon S3 ACL,
CannedAccessControlList.Private, unless one is explicitly provided in
the specified CopyObjectRequest.
This is not accounted for in my code fragment yet!
Good luck!
We were looking for a similar solution and eventually settled on the max-age Cache-Control directive. We eventually realized that Cache-Control overrides Expires, even when Expires is more restrictive. In any case, Cache-Control met our requirement as well.
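For reference, a minimal sketch of setting both directives on the new object metadata during the copy; the max-age value is just an example, not from the original posts.
ObjectMetadata newObjectMetadata = new ObjectMetadata();
// Cache-Control takes precedence for HTTP/1.1 clients...
newObjectMetadata.setCacheControl("public, max-age=86400");
// ...while Expires remains available as a fallback for HTTP/1.0 clients.
newObjectMetadata.setHeader("Expires", "Thu, 21 Mar 2042 08:16:32 GMT");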