I'm migrating my GAE app from the deprecated File API to the Google Cloud Storage Client Library.
I used to persist the BlobKey, but since it now has only partial support (as specified here), from now on I'll have to persist the object name instead.
Unfortunately, the object name that comes back from GCS looks more or less like this:
/gs/bucketname/819892hjd81dh19gf872g8211
As you can see, it also contains the bucket name.
Here's the issue: every time I need to get the file for further processing (or to serve it in a servlet), I have to create an instance of GcsFilename(bucketName, objectName), which gives me something like
/bucketName/gs/bucketName/akahsdjahslagfasgfjkasd
which (of course) doesn't work.
So, my question is:
- how can I generate a GcsFilename from the objectName?
UPDATE
I tried using the objectName as a BlobKey, but it just doesn't work :(
InputStream is = new BlobstoreInputStream(blobstoreService.createGsBlobKey("/gs/bucketName/akahsdjahslagfasgfjkasd"));
I got the usual error:
BlobstoreInputStream received an invalid blob key
How do I get the file using the ObjectName???
If you have persisted and retrieved a string such as String objname = "/gs/bucketname/819892hjd81dh19gf872g8211", you can split it on "/" (String[] pieces = objname.split("/")) and use the appropriate pieces in the call to GcsFilename.
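For example, a minimal sketch of that approach (assuming the persisted name always has the /gs/bucket/object form; GcsFilename is com.google.appengine.tools.cloudstorage.GcsFilename and the helper name is made up for illustration):
// Rebuild a GcsFilename from a persisted "/gs/bucket/object" string.
static GcsFilename fromPersistedName(String objname) {
    // limit = 4 keeps any "/" inside the object name intact:
    // "/gs/bucketname/819892hjd81dh19gf872g8211" -> ["", "gs", "bucketname", "819892hjd81dh19gf872g8211"]
    String[] pieces = objname.split("/", 4);
    return new GcsFilename(pieces[2], pieces[3]);
}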
I want to upload an object to an Amazon versioned bucket (using the Java AWS SDK) and set a custom version on this object (the goal is to set the same version on all objects uploaded at once).
PutObjectResult por = amazonS3Client.putObject(...);
por.setVersionId("custom_version");
So, is this the right way to set a version on the uploaded object?
Does this code lead to two separate requests to Amazon?
What if the Internet connection breaks while por.setVersionId(..) is being called?
Why doesn't por.setVersionId(..) throw an exception such as SdkClientException if this method really is trying to set a version ID on the Amazon server?
setVersionId would be something the SDK library itself uses to populate the versionId returned by the service when the object is created, so that you can retrieve it if you want to know what it is.
Version IDs in S3 are system-generated opaque strings that uniquely identify a specific version of an object. You can't assign them.
The documentation uses some unfortunate examples like "111111" and "222222," which do not resemble real version-ids. There's a better example further down the page, where you'll find this:
Unique version IDs are randomly generated, Unicode, UTF-8 encoded, URL-ready, opaque strings that are at most 1024 bytes long. An example version ID is 3/L4kqtJlcpXroDTDmJ+rmSpXd3dIbrHY+MTRCxf3vjVBH40Nr8X8gdRQBpUMLUo.
Only Amazon S3 generates version IDs.
They cannot be edited.
You don't get an error here because all this method does is set the versionId inside the PutObjectResult object in local memory after the upload has finished. It succeeds, but serves no purpose.
To store user-defined metadata with objects, such as your release/version-id, you'd need to use object metadata (x-amz-meta-*) or the new object tagging feature in S3.
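For illustration, here's a sketch of the metadata approach with the Java SDK (the metadata key "release-version" and the bucket, key, and file names are placeholders, not anything from your code):
ObjectMetadata metadata = new ObjectMetadata();
metadata.addUserMetadata("release-version", "custom_version"); // sent as x-amz-meta-release-version

PutObjectResult por = amazonS3Client.putObject(
        new PutObjectRequest("my-bucket", "my-key", new File("payload.bin"))
                .withMetadata(metadata));

// In a versioned bucket the system-generated version ID is still returned to you:
String s3VersionId = por.getVersionId();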
I have a JSON that looks more or less like this:
{"id":"id","date":"date","csvdata":"csvdata".....}
where csvdata property is a big amount of data in JSON format too.
I was trying to POST this JSON using AJAX in Play! Framework 1.4.x, so I sent it just like that, but when I receive the data on the server side, csvdata looks like [object Object], and that is what gets stored in my DB.
My first thought to solve this was to send the csvdata JSON as a string and store it as a longtext, but when I try this, my request fails with the following error:
413 (Request Entity Too Large)
And Play's console shows me this message:
Number of request parameters 3623 is higher than maximum of 1000, aborting. Can be configured using 'http.maxParams'
I also tried adding http.maxParams=5000 to application.conf, but the only result is that Play's console says nothing and the field is stored as null in my database.
Can anyone help me, or maybe suggest another solution to my problem?
Thank you so much in advance.
Is it possible that you sent "csvdata" as an object or array rather than a string? Each element in it would then become a separate request parameter. I have sent 100KB strings using AJAX without running into the http.maxParams limit. You can check the contents of the request body using your browser's developer tools.
If your csvdata originates as a file on the client's machine, then the easiest way to send it is as a File. Your controller action would look like:
public static void upload(String id, Date date, File csv) {
...
}
When Play! binds a parameter to the File type, it writes the contents of the parameter to a temporary file which you can read in. (This avoids running out of memory if a large file is uploaded.) The File parameter type was designed for a normal form submit, but I have used it with AJAX when the browser supported some HTML5 features (the File API and FormData).
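A minimal sketch of what the action might do with that temporary file (play.libs.IO is Play 1.x's file helper; the CsvRecord model class is purely illustrative):
public static void upload(String id, Date date, File csv) {
    // Play has already written the uploaded content to a temp file; read it back as text.
    String csvdata = play.libs.IO.readContentAsString(csv);

    // Persist it however your model expects, e.g. into a longtext column.
    CsvRecord record = new CsvRecord(id, date, csvdata); // hypothetical JPA model
    record.save();

    renderText("ok");
}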
Is there a way to delete a resource from AWS S3 using the java sdk by URL?
I know you can delete a resource using a bucket name and keyName like this:
s3client.deleteObject(new DeleteObjectRequest(bucketName, keyName));
The issue is that I only have access to the resource URL, so I would need to manipulate the string to extract the bucket name and key name.
It would be much cleaner if there were a way to delete by passing the URL directly.
There doesn't appear to be a way to simply pass the URL.
There's this, though:
AmazonS3URI
public AmazonS3URI(String str)
Creates a new AmazonS3URI by parsing the given string. String will be URL encoded before generating the URI.
http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/AmazonS3URI.html
You can call getKey and getBucket on it to extract the strings you need. It's still messy, but at least it looks like you don't have to write your own parser.
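For example, a sketch of that parse-then-delete approach (resourceUrl stands for whatever URL string you already have; error handling omitted):
// Parse the URL into bucket and key, then issue the normal delete call.
AmazonS3URI s3Uri = new AmazonS3URI(resourceUrl);
s3client.deleteObject(new DeleteObjectRequest(s3Uri.getBucket(), s3Uri.getKey()));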
I wrote a Google App Engine application that makes use of the Blobstore to save programmatically generated data. To do so, I used the Files API, which unfortunately has been deprecated in favor of Google Cloud Storage. So I'm rewriting my helper class to work with GCS.
I'd like to keep the interface as similar as possible to what it was before, also because I persist BlobKeys in the Datastore to keep references to the files (and changing the model of a production application is always painful). When I save something to GCS, I retrieve a BlobKey with
BlobKey blobKey = blobstoreService.createGsBlobKey("/gs/" + fileName.getBucketName() + "/" + fileName.getObjectName());
as prescribed here, and I persist it in the Datastore.
So here's the question: the documentation tells me how to serve a GCS file with blobstoreService.serve(blobKey, resp); in a servlet response, BUT how can I retrieve the file content (as an InputStream, a byte array, or whatever) to use it in my code for further processing? In my current implementation I do that with a FileReadChannel reading from an AppEngineFile (both deprecated).
Here is the code to open a Google Cloud Storage object as an InputStream. Unfortunately, you have to use the bucket name and object name, not the blob key:
GcsFilename gcsFilename = new GcsFilename(bucketName, objectName);
GcsService service = GcsServiceFactory.createGcsService();
ReadableByteChannel rbc = service.openReadChannel(gcsFilename, 0); // read from offset 0
InputStream stream = Channels.newInputStream(rbc);
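If you then need the whole content as a byte array for further processing, one way (a sketch that assumes the file fits comfortably in memory) is to drain the stream:
ByteArrayOutputStream out = new ByteArrayOutputStream();
byte[] buffer = new byte[8192];
int read;
while ((read = stream.read(buffer)) != -1) {
    out.write(buffer, 0, read);
}
stream.close();
byte[] content = out.toByteArray();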
Given a blobKey, use the BlobstoreInputStream class to read the value from Blobstore, as described in the documentation:
BlobstoreInputStream in = new BlobstoreInputStream(blobKey);
You can get the Cloud Storage filename only in the upload handler (fileInfo.gs_object_name) and store it in your database. After that it is lost, and it does not seem to be preserved in BlobInfo or other metadata structures.
Google says:
Unlike BlobInfo metadata FileInfo metadata is not persisted to datastore. (There is no blob key either, but you can create one later if needed by calling create_gs_key.) You must save the gs_object_name yourself in your upload handler or this data will be lost.
Sorry, this is a Python link, but it should be easy to find something similar in Java.
https://developers.google.com/appengine/docs/python/blobstore/fileinfoclass
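In the Java SDK the equivalent looks roughly like this (a sketch for the upload-handler servlet; "myFile" stands for your upload form's field name, and persisting the value is left to you):
// Inside your upload handler's doPost(HttpServletRequest req, ...):
BlobstoreService blobstoreService = BlobstoreServiceFactory.getBlobstoreService();
Map<String, List<FileInfo>> infos = blobstoreService.getFileInfos(req);
FileInfo info = infos.get("myFile").get(0);   // "myFile" is the form field name (assumption)
String gsObjectName = info.getGsObjectName(); // e.g. "/gs/bucketname/819892hjd81dh19gf872g8211"
// Save gsObjectName to the Datastore here; it is not stored anywhere else.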
Here is the Blobstore approach (sorry, this is Python, but I am sure you'll find something quite similar for Java):
blob_reader = blobstore.BlobReader(blob_key)
if blob_reader:
    file_content = blob_reader.read()
I'm hoping the answer to this question is quite simple, but I can't get it working after looking at the Azure Java API documentation.
I am trying to create an empty CloudBlockBlob, which will have blocks uploaded to it at a later point. I have successfully uploaded blocks before, when the blob is created upon the first block being uploaded, but I can't seem to get anything other than "the specified blob does not exist" when I try to create a new blob without any data and then access it. I need this because in my service, a call is first made to create the new blob in Azure, and later calls upload blocks to it (at which point a check is made to see if the blob exists). Is it possible to create an empty blob in Azure and upload data to it later? What have I missed?
I've not worked with the Java SDK, so I may be wrong, but I tried creating an empty blob using C# code (storage client library 2.0), and if I upload an empty input stream, an empty blob with zero byte size is created. I did something like the following:
CloudBlockBlob emptyBlob = blobContainer.GetBlockBlobReference("emptyblob.txt");
using (MemoryStream ms = new MemoryStream())
{
emptyBlob.UploadFromStream(ms);//Empty memory stream. Will create an empty blob.
}
I did look at the Azure SDK for Java source code on GitHub here: https://github.com/WindowsAzure/azure-sdk-for-java/blob/master/microsoft-azure-api/src/main/java/com/microsoft/windowsazure/services/blob/client/CloudBlockBlob.java and found an "upload" function where you can specify an input stream. Try it out and see if it works for you.
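A rough Java equivalent of the C# snippet above, assuming the com.microsoft.windowsazure.services.blob.client classes linked there (the connection string, container name, and blob name are placeholders):
CloudStorageAccount account = CloudStorageAccount.parse(storageConnectionString);
CloudBlobClient blobClient = account.createCloudBlobClient();
CloudBlobContainer container = blobClient.getContainerReference("mycontainer");

// Uploading a zero-length stream should create an empty (zero-byte) block blob,
// which later calls can then find and add blocks to.
CloudBlockBlob emptyBlob = container.getBlockBlobReference("emptyblob.txt");
emptyBlob.upload(new ByteArrayInputStream(new byte[0]), 0);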