GAE: Access blobstore content programmatically? - java

My use case: I don't use the Blobstore for uploading and downloading files. I use the Blobstore to store very large strings that my program creates. By persisting the path where the blob is stored, I can later load the string again (see documentation).
My question: Is there an easier way to access the blob content without having to store the path? BlobstoreService only lets me serve the blob directly to the HttpServletResponse.

You only need to store a BlobKey - there is never a need to store a path (files or not).
To access the contents of a blob:
BlobstoreService blobStoreService = BlobstoreServiceFactory.getBlobstoreService();
String myString = new String(
    blobStoreService.fetchData(blobKey, 0, BlobstoreService.MAX_BLOB_FETCH_SIZE - 1));
EDIT:
If the string is longer than MAX_BLOB_FETCH_SIZE, fetch the blob's data in a loop and then turn the assembled byte array into a string in any of the standard ways.
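A minimal sketch of that loop, assuming blobKey is at hand and the blob's total length blobSize is known (e.g. from its BlobInfo); both variable names are illustrative:
// Fetch a large blob in MAX_BLOB_FETCH_SIZE chunks and assemble the bytes.
BlobstoreService blobstoreService = BlobstoreServiceFactory.getBlobstoreService();
ByteArrayOutputStream out = new ByteArrayOutputStream();
long chunk = BlobstoreService.MAX_BLOB_FETCH_SIZE;
for (long start = 0; start < blobSize; start += chunk) {
    long end = Math.min(start + chunk, blobSize) - 1; // fetchData's end index is inclusive
    byte[] part = blobstoreService.fetchData(blobKey, start, end);
    out.write(part, 0, part.length);
}
String myString = new String(out.toByteArray(), StandardCharsets.UTF_8);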

I guess when you said "persisting the path where the blob is stored", you mean BlobKey?
FileService allows you to directly access blob data:
// Get a file service
FileService fileService = FileServiceFactory.getFileService();
// Get a file backed by the blob
AppEngineFile file = fileService.getBlobFile(blobKey);
// Get a read channel
FileReadChannel readChannel = fileService.openReadChannel(file, false);
// Since you store a String, read it back as text
BufferedReader reader = new BufferedReader(Channels.newReader(readChannel, "UTF-8"));
// Read in a loop until all data is consumed
StringBuilder contents = new StringBuilder();
String line;
while ((line = reader.readLine()) != null) {
    contents.append(line);
}

Related

How to fetch Azure Blob Content Using Java from Azure Functions

I am creating an Azure Function using Java. My requirement is to copy a blob from one container to another container with encryption, so to encrypt the blob I am adding 4 bytes before and after it while uploading to the sink container.
Now I need to fetch the blob content. For this I found a binding in Azure, i.e.:
@BlobInput(
    name = "InputFileName",
    dataType = "binary",
    path = sourceContainerName + "/{InputFileName}")
byte[] content,
Here byte[] content holds the fetched blob content, but I am facing some issues: if I pass any file name as the InputFileName parameter it returns 200 OK, i.e. success, and exception handling is difficult for me.
So I am looking for other ways to fetch blob content. Please let me know if there are other methods or classes I can use.
If you are looking for more control, instead of using the bindings you can use the Azure Storage SDK directly. Check out the quickstart doc to get set up.
This sample has full end-to-end code that you could build upon. Here is the part you are looking for, for reference:
String data = "Hello world!";
InputStream dataStream = new ByteArrayInputStream(data.getBytes(StandardCharsets.UTF_8));
/*
* Create the blob with string (plain text) content.
*/
blobClient.upload(dataStream, data.length());
dataStream.close();
/*
* Download the blob's content to output stream.
*/
int dataSize = (int) blobClient.getProperties().getBlobSize();
ByteArrayOutputStream outputStream = new ByteArrayOutputStream(dataSize);
blobClient.downloadStream(outputStream);
outputStream.close();
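The sample assumes a blobClient has already been built. A minimal sketch of that setup with the v12 SDK, where the connection string, container name, and blob name are placeholders:
// Build a client for a single blob; the names below are illustrative.
BlobServiceClient serviceClient = new BlobServiceClientBuilder()
        .connectionString(System.getenv("AZURE_STORAGE_CONNECTION_STRING"))
        .buildClient();
BlobContainerClient containerClient = serviceClient.getBlobContainerClient("my-container");
BlobClient blobClient = containerClient.getBlobClient("my-blob.txt");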

Is it possible to read the beginning of an S3 Inputstream more than once?

I'm currently getting a ResponseInputStream<GetObjectResponse> from the S3Client (SDK 2), reading it into a byte array, and opening two ByteArrayInputStreams to pass to Apache Tika and ImageIO.read.
Tika detects the MIME type, and BufferedImage is used to get height and width. Neither operation needs to read the whole file (at least not for all image types), but reading into a byte array requires consuming the whole file.
How could I open two streams and just discard them when I'm done? Is the only way to perform two getObject calls to S3? Mark and reset aren't supported by the SDK.
One possible way: if you add the metadata info to the request while uploading the image, then later you only need to call the GetObjectMetadata operation, and you can get the information you need without retrieving the whole object again.
// Upload a String as a new object.
s3Client.putObject(bucketName, stringObjKeyName, "Uploaded String Object");
// Upload a file as a new object with ContentType and title specified.
PutObjectRequest request = new PutObjectRequest(bucketName, fileObjKeyName, new File(fileName));
ObjectMetadata metadata = new ObjectMetadata();
metadata.setContentType("plain/text");
metadata.addUserMetadata("title", "someTitle");
request.setMetadata(metadata);
s3Client.putObject(request);
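To read that information back later without downloading the object body, a getObjectMetadata call should be enough (same v1 SDK as the snippet above; bucketName and fileObjKeyName are the names assumed there):
// Fetch only the object's metadata; the object body is not transferred.
ObjectMetadata meta = s3Client.getObjectMetadata(bucketName, fileObjKeyName);
String contentType = meta.getContentType();          // "plain/text"
String title = meta.getUserMetadata().get("title");  // "someTitle"
long size = meta.getContentLength();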

How to read the contents of an uploaded blob?

I'm using the Blobstore to upload a simple text file using this doc: https://cloud.google.com/appengine/docs/java/blobstore/#Java_Uploading_a_blob . I understand from the docs how to save and serve the blob to users, but I don't understand how my servlet that handles the file upload can actually read the contents of the text file.
I found the answer. This is the code:
Map<String, List<FileInfo>> infos = blobstoreService.getFileInfos(request);
Long fileSize = infos.get("myFile").get(0).getSize();
Map<String, List<BlobKey>> blobKeys = blobstoreService.getUploads(request);
// fetchData's end index is inclusive, hence fileSize - 1
byte[] fileBytes =
    blobstoreService.fetchData(blobKeys.get("myFile").get(0), 0, fileSize - 1);
String input = new String(fileBytes);
In Python there is the BlobReader class to help you do this (https://cloud.google.com/appengine/docs/python/blobstore/blobreaderclass).
It seems like you are using Java, though, and there does not seem to be an equivalent class in Java. What I would do is use GCS as the backing for your Blobstore (https://cloud.google.com/appengine/docs/java/blobstore/#Java_Using_the_Blobstore_API_with_Google_Cloud_Storage). This way the files uploaded to the Blobstore will be accessible in GCS.
You can then read the file using the GCS client library for Java, as sketched below.
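A minimal sketch of that read, assuming you know which bucket and object the upload landed in (the names here are illustrative):
// Open the uploaded object through the GCS client library and read it as text.
GcsService gcsService = GcsServiceFactory.createGcsService();
GcsFilename filename = new GcsFilename("my-bucket", "uploads/my-file.txt");
GcsInputChannel channel = gcsService.openPrefetchingReadChannel(filename, 0, 1024 * 1024);
BufferedReader reader = new BufferedReader(Channels.newReader(channel, "UTF-8"));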

Get a Google Cloud Storage file from its BlobKey

I wrote a Google App Engine application that makes use of the Blobstore to save programmatically generated data. To do so, I used the Files API, which unfortunately has been deprecated in favor of Google Cloud Storage. So I'm rewriting my helper class to work with GCS.
I'd like to keep the interface as similar as possible to what it was before, also because I persist BlobKeys in the Datastore to keep references to the files (and changing the model of a production application is always painful). When I save something to GCS, I retrieve a BlobKey with
BlobKey blobKey = blobstoreService.createGsBlobKey("/gs/" + fileName.getBucketName() + "/" + fileName.getObjectName());
as prescribed here, and I persist it in the Datastore.
So here's the question: the documentation tells me how to serve a GCS file with blobstoreService.serve(blobKey, resp); in a servlet response, BUT how can I retrieve the file content (as InputStream, byte array or whatever) to use it in my code for further processing? In my current implementation I do that with a FileReadChannel reading from an AppEngineFile (both deprecated).
Here is the code to open a Google Cloud Storage object as an InputStream. Unfortunately, you have to use the bucket name and object name, not the BlobKey:
GcsFilename gcs_filename = new GcsFilename(bucket_name, object_name);
GcsService service = GcsServiceFactory.createGcsService();
ReadableByteChannel rbc = service.openReadChannel(gcs_filename, 0);
InputStream stream = Channels.newInputStream(rbc);
Given a blobKey, use the BlobstoreInputStream class to read the value from Blobstore, as described in the documentation:
BlobstoreInputStream in = new BlobstoreInputStream(blobKey);
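From there any standard InputStream idiom applies; for example, a small sketch that drains the stream into a byte array (the buffer size is arbitrary):
// Read the whole blob from the stream into a byte array.
ByteArrayOutputStream out = new ByteArrayOutputStream();
byte[] buffer = new byte[8192];
int n;
while ((n = in.read(buffer)) != -1) {
    out.write(buffer, 0, n);
}
byte[] contents = out.toByteArray();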
You can get the Cloud Storage filename only in the upload handler (fileInfo.gs_object_name) and store it in your database. After that it is lost, and it does not seem to be preserved in BlobInfo or other metadata structures.
Google says:
Unlike BlobInfo metadata, FileInfo metadata is not persisted to datastore. (There is no blob key either, but you can create one later if needed by calling create_gs_key.) You must save the gs_object_name yourself in your upload handler or this data will be lost.
Sorry, this is a Python link, but it should be easy to find something similar in Java.
https://developers.google.com/appengine/docs/python/blobstore/fileinfoclass
Here is the Blobstore approach (sorry, this is for Python, but I am sure you find it quite similar for Java):
blob_reader = blobstore.BlobReader(blob_key)
if blob_reader:
    file_content = blob_reader.read()

How to read content of a blob and write to GAE datastore (Java)

How can I read the content of a blob and write it to the GAE datastore in Java?
Once you have the BlobKey for the blob you want to read, you can construct a BlobstoreInputStream:
BlobKey blobKey = ...;
InputStream is = new BlobstoreInputStream(blobKey);
You can then read the blob contents using any of the InputStream read methods.
You can also use the FileService API to create, write, and read files in the Blobstore. Once you have read a byte array from the file, you can add it as a property on a Datastore entity and save it; a sketch follows.
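A minimal sketch of that flow, assuming blobKey is available and the blob fits in a single fetch; the entity kind "MyKind" and property name "content" are illustrative:
// Read the blob's bytes and store them on a datastore entity.
BlobstoreService blobstoreService = BlobstoreServiceFactory.getBlobstoreService();
byte[] bytes = blobstoreService.fetchData(blobKey, 0, BlobstoreService.MAX_BLOB_FETCH_SIZE - 1);
DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
Entity entity = new Entity("MyKind");
// com.google.appengine.api.datastore.Blob wraps raw bytes for datastore storage
entity.setProperty("content", new Blob(bytes));
datastore.put(entity);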
